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t This  report  presents  two  methods  for  on-line  and  real  time  estimation  of 
parameters  for  application  to  aircraft,  missiles,  surface  and  subsurface  marine 
vehicles  and  various  subsystems.  The  first  method  is  based  on  the  instrumental 
variables  approach.  A new  technique  for  choosing  the  instrument  matrices  is 
developed.  This  technique  ensures  algorithm  convergence  even  with  bad  initial 
conditions  and  when  the  system  parameters  change.  The  second  method  uses  the 
sensitivity  functions  reductions  technique  in  the  maximum  likelihood  approach  to 
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20.  (Continued)  give  a viable  on-line  or  real  time  method.  A maximum  likeli 
hood  approach  is  also  suggested  for  nonlinear  systems.  The  trade  offs  of 
the  two  methods  are  discussed. 

These  techniques  are  applied  to  simulation  and  flight  test  data. 
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I . INTRODUCTION  AND  SUMMARY 


1.1  INTRODUCTION 


The  requirement  to  reduce  test  and  evaluation  time  has  become  a primary 
objective  in  Navy  systems  development.  This  requirement,  however,  must  be 
applied  to  increasingly  more  complicated  and  sophisticated  hardware,  such 
as  aircraft,  surface  and  subsurface  marine  vehicles,  and  inertial  navigation 
components.  The  result  is  increasing  dependence  on  advanced  data  acqusition 
systems  by  research,  test,  and  evaluation  agencies.  To  maximize  the  effec- 
tiveness of  such  data  acquisition  capabilities,  specialized  software  algo- 
rithms are  needed  to  provide  test  system  evaluation  parameters  in  the 
minimum  engineering  and  test  time.  These  software  programs,  for  example, 
are  developed  to  quantify  aircraft  performance,  calibrate  instrumentation, 
and  monitor  critical  subsystems  (e.g.,  engines). 


One  important  type  of  algorithm  is  parameter  identification.  These 
algorithms  use  test  data  to  estimate  the  fundamental  parameters  which  govern 
the  response  characteristics  of  the  particular  system  being  evaluated. 
Advanced  parameter  identification  techniques,  for  example,  have  been  imple- 
mented and  successfully  used  to  reduce  evaluation  time  for  stability  and 
control  parameters  of  advanced  Naval  aircraft  [1] . These  applications  have 
demonstrated  the  potential  benefits  of  even  more  advanced  parameter  identi- 
fication algorithms  which  can  be  used  for  real  time  calculations  (e.g., 

"as  the  data  comes  in").* 


For  the  purposes  of  this  report,  "real  time"  implies  available  calculation 
results  within  a test  data  length;  "near  real  time"  implies  results  which 
are  available  sufficiently  soon  after  a particular  data  length  that  sequen 
tial  data  lengths  can  be  continually  monitored  without  interruption;  "on- 
line" implies  that  sensor  data  is  passed  to  processing  computers  as  the 
data  is  generated;  "off-line"  implies  storage  of  data  to  be  processed  at 
a later  time. 


■ 


Such  algorithms  cannot  only  be  used  for  substantial  reductions  in 
testing  and  processing  time,  but  also  for  system  diagnostic  monitoring 
and  control.  A summary  of  such  applications  is  given  in  Table  1. 

The  objective  of  the  analyses  conducted  for  this  effort  has  been  to 
develop  the  fundamental  theoretical  and  applications  methods  for  provid- 
ing real  and  near -real  time  parameter  estimates  from  test  data.  The 
principal  results  are  sunmarized  in  the  following  section. 

1.2  PRINCIPAL  RESULTS 

There  are  basically  two  approaches  to  developing  a real  time 
parameter  identification  algorithm.  These  are  as  follows: 

a.  Simplification  of  an  Off-Line  Algorithm:  In  order  to  optimally 

process  a data  set,  a minimal  number  of  assumptions  are  evoked. 
The  result  is  a complex  computational  algorithm  requiring 
sufficient  execution  time  and  machine  memory  allocation  that 
only  post-test  calculations  are  practical.  By  restricting  the 
test  system  analytical  complexity  (e.g.,  limiting  to  linear 
systems) , by  making  the  program  more  dependent  on  a priori 
start-up  data,  and  by  a general  "streamlining"  of  the  program 
structure,  significant  reductions  in  computer  resources  are 
achievable. 

b.  Development  of  a Real  Time  Algorithm:  An  alternate  approach 

is  to  formulate  a parameter  identification  algorithm  within  the 
constraint  of  real-time  capability.  This  approach  therefore 
evokes  simplifying  assumptions  af  the  beginning,  instead  of  the 
end,  of  the  algorithm  development. 

In  summary,  one  may  simplify  an  off-line  algorithm,  which  was 
developed  for  accuracy,  not  speed,  or  one  may  specifically  develop  a fast 
algorithm,  with  possible  compromises  on  accuracy.  Both  approaches  have 
been  developed  and  evaluated  for  this  work.  In  particular,  the  following 
methods  have  been  developed  and  applied  to  flight  test  data. 


Table  1 

Summary  of  Applications  of  On-Line  Parameter  Identification  Systems 


disturbances.  On-line  parameter  identi- 
fication is  basic  to  this  task. 


a*  Near  Real  Time  Maximum  Likelihood  Algorithm:  The  implementation 

of  a new  algorithm  into  an  advanced  maximum  likelihood  parameter 
identification  program  has  been  found  to  reduce  execution  time 
by  a factor  of  5:1.  Denoted  as  sensitivity  function  reduction, 
this  algorithm  provides  the  potential  of  even  further  minimiza- 
tion of  execution  time  for  linear  systems. 

b.  Real  Time  Instrumental  Variable  Method:  A new  formulation  of  a 

sequential  least  square  filter  based  on  the  statistical  method 
of  instrumental  variables,  has  produced  a real  time  capability 
for  providing  parameter  estimates  for  linear  systems. 

In  addition  to  these  implemented  algorithms,  three  other  related 
algorithms  have  been  developed  for  possible  future  implementation.  These 
are  as  follows: 

a*  Real  Time  Maximum  Likelihood  Method:  A unique  reformulation  of 

the  maximum  likelihood  method  has  been  completed.  This 
algorithm  promises  estimates  of  high  statistical  efficiency. 

b.  On-Line  Data  Consistency  Analysis:  A recurring  problem  in 

system  testing  is  the  poor  quality  of  certain  key  data  from 
low  accuracy  or  even  partially  failed  instruments.  Miscali- 
brations,  data  channel  losses,  or  spurious  noise  inputs 
contribute  to  such  problems.  To  alleviate  attempts  to  process 
such  data,  whose  results  may  be  marginal,  a method  for  assess- 
ing the  relative  accuracy  of  data  channels  has  been  developed. 

c.  On-Line  Data  Quality  Analysis:  One  further  step  in  data  quality 

analysis  is  the  determination  of  the  information  about  a desired 
parameter  set.  This  quality,  denoted  as  identifiability,  pro- 
vides an  assessment  of  the  test  input,  instrumentation  accuracy, 
and  correctness  of  the  assumed  model  for  the  system  under  test. 

A new  technique  for  evaluating  the  identifiability  of  data  has 
thus  been  formulated  to  improve  the  overall  identification  process. 


1.3  SUMMARY 


This  report  details  the  theoretical  basis  of  real  time  parameter 
identification  algorithms.  Of  necessity,  the  formulation  of  these  algorithms 
is  of  a basic  mathematical  nature,  and  Chapter  II  provides  the  background 
concepts  and  definitions  upon  which  the  subsequent  algorithmic  developments 
rest.  Chapter  III  details  a new  on-line  instrumental  variable  method  which 
overcomes  many  problems  encountered  in  previous  work.  Also  presented  is  a 
unique  maximum  likelihood  algorithm  for  real  time  processing  requirements. 
Chapter  IV  presents  the  implementation  of  the  instrumental  variable  method 
and  results.  Some  results  on  on-line  evaluation  of  data  quality  are  given 
in  Chapter  V. 


II.  REQUIREMENTS  FOR  ON-LINE  PARAMETER  IDENTIFICATION 
AND  REVIEW  OF  PREVIOUS  METHODS 


This  chapter  reviews  the  identification  problem  with  respect  to  typical 
computational  requirements  imposed  by  accurately  estimating  a large  number 
of  parameters  (Section  2.1).  Then,  a background  of  mathematical  terminology 
and  system  representations  is  provided  to  classify  the  constraints  which 
must  be  considered  in  reducing  these  computational  requirements  (Sections  2.2 
and  2.3).  A review  of  previous  approaches  is  then  presented  which  introduces 
the  developments  of  the  following  Chapter  III. 

2.1  COMPUTATIONAL  REQUIREMENTS  FOR  ADVANCED  PARAMETER  IDENTIFICATION 

ALGORITHMS 

In  general,  there  is  no  universal  specification  for  the  length  of  time 
or  computer  capability  needed  to  estimate  the  coefficients  of  a specific 
system  from  input -output  data.  Such  computational  requirements  are  functions 
of  the  type  of  system  (e.g.,  linear  or  nonlinear,  dimension  of  the  system), 
the  data  length,  the  number  of  parameters  to  be  identified,  and  the  estimate 
accuracy  desired.  These  are  the  basic  factors  governing  the  computer 
requirement.  In  general,  as  also  discussed  in  Chapter  I,  there  is  a trade 
off  between  computer  requirements  and  estimate  accuracy.  This  trade  off 
has  not  be  quantified  in  general  due  to  the  difficulty  (and  expense)  of 
trying  to  do  so  for  a practical  problem,  with  many  state  and  parameters  to 
be  identified,  over  all  possible  types  of  computers,  algorithms,  and  data 
lengths . 

For  reference,  a typical,  highly  accurate  off-line  parameter  identifi- 
cation program  can  be  used  to  establish  a basis  for  discussing  computer 
requirement  reduction.  A general-purpose  (e.g.,  linear  or  nonlinear  system 
capability)  maximum  likelihood  computer  program,  for  example,  has  a computer 
time  requirement  as  shown  in  Figure  1.  This  computer  program  will  identify 
an  arbitraily  large  number  of  parameters  of  the  system  dynamics,  the  measure - 
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ment  instrumentation  errors,  and  process  noise  characteristics.  The  system 
may  have  linear  or  nonlinear  dynamics  and  the  program  is  designed  to  be  able 
to  treat  nonlinear  aerodynamics  with  coding  modification.  Figure  1 shows, 
for  two  cases  of  data  point  number  N,  the  approximate  increase  in  execution 
time  versus  number  of  parameters  identified.  For  exanple,  about  six  minutes 
are  required  (on  a UNIVAC  1108)  to  estimate  ten  parameters  from  poor  initial 
estimates.  This  program  has  not  been  speed  optimized,  however,  by 
programing  some  of  the  subroutines  in  machine  language  or  specializing 
the  coding  to  that  required  for  a linear  system. 

The  characteristic  times  shown  in  Figure  1 may  not  be  satisfactory  for 
operational  on-line  use  of  the  program.  The  question  is  then  as  to  whether 
the  slope  of  this  characteristic  can  be  decreased,  and  what  accuracy  com- 
promises are  required,  if  any,  in  so  doing. 


It  will  be  shown  in  subsequent  chapters  that  the  characteristic  slope 
of  Figure  1 can  in  fact  be  reduced,  at  no  loss  in  accuracy  if  the  program 
is  applied  only  to  linear  systems.  Before  proceeding  to  these  new  results, 
however,  it  is  necessary  to  establish  more  precise  meanings  of  parameter 
estimate  accuracy,  and  review  some  previous  work. 


2.2  PROPERTIES  OF  PARAMETER  ESTIMATES  AND  IDENTIFICATION  ALGORITHMS 


To  illustrate  the  various  properties  of  parameter  estimates  and 
identification  algorithms,  consider  a system  parameterized  on  m parameters 
whose  true  values  6 are  not  known.  Suppose  that,  based  on  a certain  set 
of  data,  x,  an  identification  technique  gives  the  estimate  9 for  9.  In 
general,  we  are  interested  in  knowing  how  close  the  estimates  are  to  the 
true  values  of  parameters.  Various  indicators  of  the  closeness  of  9 and 

A 

9 are  described  by  the  statistical  properties  of  the  estimate  9,  some  of 
which  are  described  as  follows  [2]. 
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COMPUTER  TIME  (MINUTES) 


Figure  1 Computer  Execution  Time  Versus  Number  of  Parameters 
Identified  for  a General  Purpose  Maximum  Likelihood 
Program 
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1.  Bias:  0 is  an  unbiased  estimator  of  0 if 

E(0)  = 0 (2.1) 

where  E stands  for  the  "expected  value".  In  physical  terms, 
this  implies  that  if  the  data  x were  collected  a large  number 
of  times,  the  mean  of  the  different  estimates  0 is  0. 

2.  Minimum  Variance:  0 is  a minimum  variance  estimator  of  0 if 

for  any  other  estimator  0 

E{0  - E(0)}2  < E{0  - E(0)}2  (2.2)  I 

This  implies  that  0 would  show  the  minimum  variation  if  the 
parameters  were  estimated  a large  number  of  times  using  data 
from  different  runs. 

3.  Efficient : 0 is  an  efficient  estimator  of  0 if  it  is  un- 
biased and  if  for  any  other  unbiased  estimator  0 of  0 

E(0  - 0)2  < E(0  - 0)2  (2.3) 

4.  Consistent : 0 is  a consistent  estimator  of  0 if  the  accuracy 

of  the  estimate  improves  with  increasing  amount  of  data  and 
approaches  the  true  value  as  the  amount  of  data  tends  to  infinity. 

In  other  words,  for  any  positive  n and  e 

p(|0n  - 0|  < e)  > 1 - n n > N (2.4) 

This  definition  is  analogous  to  the  definition  of  convergence 
in  real  analysis. 
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In  addition  to  these  statistical  properties,  the  convergence  of 
the  algorithm  should  be  assured  even  with  poor  starting  values  and  when 
the  parameters  change  either  slowly  or  abruptly.  This  is  extremely 
important,  much  more  important  than  in  an  off-line  algorithm,  because 
the  on-line  parameters  may  be  used  in  critical  real  time  tasks  where 
human  intervension  may  be  undesirable  or  infeasible.  The  convergence 
of  the  parameter  estimation  algorithm  must,  in  other  vords,  be  robust 
with  respect  to  starting  parameter  values. 

At  this  point,  we  also  define  the  concept  of  "probability  in  the 
limit".  Let  yn  be  a random  variable,  which  is  a function  of  n.  Then 
the  probability  in  the  limit  of  yn  is  y,  if  the  distribution  of  yn 
collapses  to  a single  point  y as  n becomes  large.  This  is  different 
than  the  convergence  of  the  expected  value  of  yn  to  y. 

2.3  TWO  REPRESENTATIONS  OF  THE  DYNAMIC  Si' STEM 

Two  main  representations  have  been  used  for  linear  time  invariant 
systems,  which  can  be  described  by  a Markov  process.  These  are:  (a) 

state  space  form,  and  (b)  autoregressive  moving  average  (ARMA)  form. 

2.3.1  State  Space  Representation 

The  central  concept  in  the  state  space  representation  of  the  sys- 
tem is  the  definition  of  state.  At  any  point  in  time,  it  suimvarizes 
the  past  history  of  the  system  from  the  viewpoint  of  predicting  the  future 
response.  Examples  of  state  are  the  position  and  velocity  of  a particle 
acted  upon  by  a force.  If  x is  the  state  and  y is  the  measurement, 
then  the  system  representation  in  discrete  form  is 


x(k+l)  = 4>x(k) 

+ Gu(k)  + Tw(k) 

(2.5) 

xfk)  = Hx(k) 

+ v(k) 

(2.6) 
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The  state  equations  (2.5)  are  driven  by  inputs  u and  random  noise  w 
(called  process  noise).  Note  that  the  determination  of  x(k+l)  requires 

the  knowledge  of  u(k) , w(k)  and  x(k)  but  not  x(l),  x(2) 

x(k-l).  p,  G and  r describe  the  behavior  of  the  system.  The  measure- 
ments y are  also  contaminated  by  random  noise.  The  matrix  H describes 
the  instrumentation  system.  The  number  of  measurements  can  be  greater 
than,  less  than  or  equal  to  the  number  of  states.  This  system  can  be 
written  in  the  innovations  representation 


x(k+l) 

= <J>x(k)  + Gu(k)  + (j)Kv(k) 

(2.7) 

yOO 

II 

£> 

2 

+ 

c 

f — \ 

X1 

V — / 

(2.8) 

v(k)  are  the  innovations  and  are  vhite,  and 
In  state  estimation  context,  the  innovations 

K is  the  Kalman  gain, 
represent  new  information 

brought  in  by  each  measurement.  The  innovations  representation  is 
particularly  useful  in  estimation  and  identification  problems,  since 

A 

x(k)  is  the  only  component  of  x(k)  which  can  be  estimated. 

2.3.2  Autoregressive  Moving  Average  (AKMA)  Representation 

For  a multi-output,  multi- input  system,  the  autoregressive  moving 
average  representation  is 

P-1  q 

Z A.  y(k-i)  = Z (B.  u(k-i)  + C-  v(k-i)}  , A = C,  = I (2.9) 
i=0  1 i=l  1 1 ol 


-j 


where  v(k)  is  white  random  noise,  p is  the  order  of  the  autoregressive 
part  and  q is  the  order  of  the  moving  average  part.  Notice  that  there 
is  no  concept  of  state  in  this  representation.  In  effect,  the  descriptions 
of  the  system  and  the  instrumentation  have  been  combined  in  one  equation. 
Equation  (2.9)  shows  directly  the  effect  of  the  input  and  the  random 
noise  on  the  output  of  the  system. 
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2.3.3  Relationship  Between  the  State  Space  Representation  and  ARMA 
Representation 

A system  which  can  be  described  by  state  space  representation  can 
also  be  described  by  an  ARMA  representation  and  vice  versa.  However, 
there  are  important  differences,  which  often  make  the  choice  of  one  form 
more  desirable  over  the  other.  It  should  be  mentioned  that  certain 
canonical  forms  in  state  space  representation  are  directly  reducible  to 
ARMA  representation.  On  the  other  hand,  by  a suitable  selection  of  the 
state  vector  the  ARMA  representation  can  be  converted  into  a state  space 
■*  form. 


The  state  space  representation  results  from  a knowledge  of  the 
physical  laws  which  govern  the  system  behavior.  Therefore,  the  system 
state  and  instrument  outputs  are  modeled  separately.  The  4>,  G,  T and 
H have  a specified  structure.  Examples  of  this  are  various  physical 
processes.  The  ARMA  representation  is  a natural  consequence  of  systems 
in  which  the  time  series  does  not  follow  any  well  defined  or  known  law. 
Examples  of  this  are  various  economic  series,  biomedical  time  series 
and  complicated  physical  processes.  In  each  of  these  ARMA  cases,  a state 
variable  model  in  canonical  form  could  also  be  applied. 

In  short,  the  state  space  and  the  ARMA  representations  should  be 
considered  as  two  complimentary  techniques  to  mathematically  describe 
the  response  of  linear  dynamic  systems. 

2.4  REVIEW  OF  PREVIOUS  ON-LINE  IDENTIFICATION  TECHNIQUES 

Several  on-line  algorithms  have  been  proposed  in  the  past  for  both 
the  state  variable  and  the  autoregressive  moving  average  representations. 
These  methods  have  tended  to  concentrate  on  ARMA  representations  and, 
to  avoid  the  identifiability  problem,  assumed  all  parameters  unknown 
and  identifiable.  The  problems  of  statistical  efficiency  and  bias  and 
algorithm  divergence  have  been  addressed  by  various  authors  but  are  not 
completely  resolved.  There  are  a number  of  schemes  to  make  the  algorithms 
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recursive.  These  methods  are  covered  in  two  recent  survey  papers  by 
Saridis  [3]  and  by  Isermann,  et  al.  [4].  Some  of  the  recently  proposed 
on-line  schemes  are  in  Reference  5. 


All  on-line  methods  can  be  classified  as  follows: 

(1)  Least  Squares  (LS)  and  Generalized  Least  Squares  (GLS) , 
Including  Two  Stage  Least  Squares  (2SLS)  ^<3  Three  Stage 
Least  Squares  (3SLS) . 

(2)  Stochastic  Approximations  (SAP). 

(3)  Kalman/Filter  Smoother  (KF/S) . 

(4)  Instrumental  Variables  (IV) . 

(5)  On-Line  Maximum  Likelihood  (OLML) . 

The  least  squares  methods  have  been  in  use  for  a long  time,  both 
in  econometrics  and  in  control  applications.  The  oasic  appeal  of  these 
methods  is  their  simplicity.  The  difference  between  the  deterministic 
parts  of  the  left  hand  side  and  the  right  hand  side  of  the  equation  is 
minimized.  For  the  ARMA  representation,  A^  and  are  chosen  to 
minimize 


I A.  y(k-i)  - E B u(k-i)  ‘ 
i=0  1 i=l  1 * 


(2.10) 


An  explicit  solution  is  straightforward.  There  are  two  problems 
with  the  estimate  obtained  by  minimizing  (2.10).  First,  since  the 
measurements  are  correlated  with  noise,  the  estimates  are  biased. 
Secondly,  since  the  measurement  noise  sequence  is  moving  average,  the 
estimates  are  inefficient.  A method  similar  to  Equation  (2.10)  can 
be  used  for  state  variable  model. 
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Atteir->ts  at  removing  this  bias  and  inefficiency  in  parameter  esti- 
mates has  given  birth  to  the  generalized  least  square  and  tvo  and  three 
stage  least  square  methods.  After  the  parameters  A^  and  are 
estimated,  the  residuals  are  determined.  The  residuals  are  nonwhite 
and  a model  is  fitted  to  the  residuals.  Consider  a single- input,  single- 
output ARMA  representation  and  let  a^  denote  the  estimated  value  of 
a- . Let 


p-1  q 

e(k)  = E a-  y(k-i)  - E b-  u(k-i) 

• _ r\  1 1 


C2.ll) 


e(k)  = n(k)  + Yl  n(k-l)  + y2  n(k-2) 


(2.12) 


where  n(k)  is  a white  noise  process.  The  order  q"+l  is  determined 
from  significance  tests  and  experience,  while  the  parameters  are  deter- 
mined from  another  least  squares.  The  y and  u time  series  are 
passed  through  the  filter  of  Equation  (2.12).  The  noise  is  then  white 
and  a new  set  of  parameters  a^  and  6^  can  be  estimated.  This  pro- 
cess is  repeated  until  the  final  estimates  of  a^,  b^  and  y^  are 
obtained.  This  is  called  the  generalized  least  square  method  [6]. 

The  two  stage  and  three  stage  least  square  methods  follow  a similar 
approach.  The  least  square  methods  can  be  recast  into  a recursive  form 
[6,  7],  making  them  very  useful  for  on-line  and  real-time  applications. 

The  stochastic  approximation  methods  were  developed  by  statisti- 
cians and  introduced  to  the  control  literature  by  Saridis  and  others  [8]. 

In  essence,  the  stochastic  approximation  methods  are  extensions  of  gradient 
methods  to  stochastic  problems.  Their  application  in  on-line  and  real  time 
identification  is  motivated  by  the  fact  that  they  can  be  converted  into 
recursive  methods,  in  which  the  parameters  are  corrected  by  an  appropriately 
weighted  error  correction  term  based  on  the  observed  and  the  expected  output. 
The  correction  term  can  be  chosen  in  a variety  of  ways.  Its  main  appeal  is 
simplicity  and  flexibility  in  implementation.  The  parameter  estimates  are 
consistent,  but  usually  inefficient.  To  make  the  method  more  efficient, 
second  order  and  accelerated  stochastic  approximation  algorithms  have  been 
developed.  The  efficiency  is  still  poor. 
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The  Kalman  filter/ smoother  approach  for  system  state  estimation  has 
found  some  application  in  on-line  and  real  time  parameter  identification. 

The  parameters  are  appended  to  the  state  vector  to  give  a larger  augmented 
state.  This  converts  an  otherwise  linear  system  into  a nonlinear  one.  The 
augmented  state  is  estimated  using  the  Kalman  filter  approach  for  systems 
governed  by  nonlinear  differential  equations.  The  parameters  and  states 
can  be  simultaneously  estimated  on-line.  This  method  has  not  found  a wide 
application  because  of  several  problems.  The  estimates  are  biased  even  in 
simple  systems.  To  start  the  algorithm,  good  estimates  of  parameters,  their 
covariances,  and  process  and  measurement  noise  covariances  are  required. 

The  algorithm  tends  to  diverge  and  neglect  measurements  after  a certain  time 
because  parameters  constitute  undisturbable  states.  To  avoid  this  divergence, 
arbitrary  white  noise  terms  are  added  to  the  parameter  equations.  Thus, 
the  equations  governing  the  parameters  become 

e(k+l)  *efk)  + nOO  (2.13) 


where  n(k)  is  a white  noise  process.  The  equations  governing  the  augmented 
state  are 


’ x(k+l)  ' 

a' 

’x(k)  ' 

’ G * 

u(k)  + 

\r  j_o‘ 

w(k)  ' 

9 (k+1) 

6 - 

i 

_6(k)  _ 

+ 

0 

0 ! I 

i 

_n(k) 

(2.14) 


yOO  = [H  : o] 


x(k) 

e’do 


v(k) 


(2.15) 


Matrix  A depends  upon  the  state  and  is  obtained  by  linearizing  the  first  two 
terms  on  the  right  hand  side  of  Eq.  (2.5)  around  the  current  state  and  control. 
The  Kalman  filter  equations  to  estimate  the  augmented  state  are 


x(k+l)" 

A 

z 

$ j A 

“ - T * " 

x_(k)* 

/A 

+ K(k)  (y  - Hx(k))  + 

" G * 

9 (k+1) 

0 ; I 

9(k) 

- 

0 

u(k)  (2.16) 


The  Kalman  gains  K and  the  estimation  error  covariance  follow  the  usual 
equations.  Note  that  the  Kalman  gain  is  a function  of  time.  The  Kalman 
filter  method  has  been  used  by  Chen  [9]. 


1 


16 


The  instrumental  variable  method  is  an  extension  of  least  squares 
techniques  which  are  easy  to  implement  but  result  in  biased  estimates 
whenever  the  measurements  are  noisy.  The  instrumental  variables  technique 
was  developed  for  econometric  applications  to  combine  the  ease  of  least 
squares  type  methods  with  a technique  to  give  unbiased  estimates.  This 
is  done  in  a very  elegant  way  by  the  choice  of  the  instrumental  variables. 
The  instrumental  variables  are  chosen  such  that  they  are  independent  of 
the  measurement  noise.  Consider  a single  output  ARMA  process.  Choose 
z(k)  as  the  vector  of  instrumental  variables.  Then, 

z(k)  y(k)  = z(k)  {y(k-l),  y(k-2) y(k-p+l),  u(k-l) u(k-q)}0 

+ z(k)  n(k)  (2.17) 

where  0 is  the  vector  of  unknown  parameters. 

Summing  this  for  the  N data  points 


N N N 

E z(k)  y(k)  = E z(k)  y*(k)0  + I z(k)  nOO 

k=l  k=l  k=l 


y*(k)  A (y(k-l),  y(k-2) , . . . ,y(k-p+l) , u(k-l) , . . . ,u(k-q) } 


and  we  can  estimate  0 using  the  equation 


„ ( N )-l  N 

0 = < E z(k)  y*  (k)  > I z(k)  y(V.) 

[ k=l  ) k=l 


assuming  that  the  first  matrix  is  invertible. 
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It  is  straightforward  to  show  that 
E(6)  = 0 

N 

iff  E L n(k)  z(k)  - 0 (2. 20) 

k=l 

In  other  words,  the  estimate  of  0 is  unbiased  if  the  instrumental  variables 
are  chosen  to  be  independent  of  the  noise  process.  This  algorithm  can  be 
used  with  state  space  models  also.  For  on-line  applications,  Eq.  (2.19)  can 
be  converted  into  a recursive  form.  The  main  problem  in  this  method  is  the 
choice  of  the  instrumental  variables.  The  technique  has  been  described  by 
Wong  and  Polak  [10]  and  applied  by  Young  [11]  and  Pandaya  [12].  Both  these 
techniques  achieve  efficiency  in  certain  cases,  but  do  not  ensure  convergence. 
Moreover,  the  results  of  Pandaya  and  Young  are  applicable  to  single  output 
systems  in  ARMA  representation. 

The  efficiency  of  the  off-line  maximum  likelihood  approach  [13]  has 
lead  to  several  on-line  schemes  for  parameter  identification  based  on  maxi- 
mizing the  likelihood  function  [14].  Basically,  these  techniques  compute 
the  first  and  the  second  gradients  of  the  likelihood  function  and  then  take 
a single  step  of  the  Newton -Raphson  gradient  algorithm.  This  step  can  be 
updated  as  new  measurements  are  received.  There  are  two  problems  with  past 
on-line  maximum  likelihood  techniques.  First,  no  attention  is  given  to  the 
efficient  computation  of  the  gradients  of  the  likelihood  function  except  in 
very  simple  autoregressive  or  autoregressive  moving  average  models.  Second, 
since  the  likelihood  function  is  not  exactly  quadratic  at  the  optimum 
parameter  values,  a single  step  of  the  Newton- Raphson  does  not  provide 
sufficiently  accurate  parameter  estimates.  The  advantage  is  that  all 
parameters  in  the  system  model  are  not  considered  unknown. 
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2.5  SUMMARY 

Considerable  work  has  been  done  in  the  development  of  on-line  and  real 
time  algorithms  for  identification.  Nevertheless,  it  is  clear  from  the 
brief  discussion  of  this  chapter  that  many  important  problems  have  been 
left  unanswered.  The  need  for  developing  algorithms  which  are  statistically 
efficient  and  have  guaranteed  convergence  cannot  be  overstressed.  Further 
work  is  also  required  in  algorithms  for  state  variable  models.  For  specific 
applications,  the  algorithms  may  be  tailored  to  particular  computers  and 
systems . 

In  the  work  reported  in  the  next  chapter,  convergent  and  efficient 
algorithms  have  been  developed  for  state  variable  models.  Two  approaches 
are  considered:  (a)  the  instrumental  variables  approach,  and  (b)  the 

maximum  likelihood  approach.  Attention  is  also  given  to  computation  time 
and  storage. 


III.  ON-LINE  AND  REAL  TIME  IDENTIFICATION  OF  PARAMETERS 


3.1  INTRODUCTION 

The  last  chapter  presented  a brief  discussion  of  the  several  existing 
on-line  techniques.  Various  trade  offs  in  choosing  specific  on-line  algo- 
rithms are  also  discussed.  It  is  concluded  that  there  is  a need  for 
statistically  efficient  and  convergent  algorithms  for  on-line  and  real  time 
identification  of  parameters  in  state  variable  models.  It  is  desired  that 
such  algorithms  be  implementable  within  small  storage  and  confutation  time. 

Two  techniques  have  been  developed  to  this  end.  They  are:  (a)  the 
instrumental  variable  approach,  and  (b)  the  on-line  maximum  likelihood  method. 
Attention  is  given  to  statistical  properties  like  the  unoiasedness,  effi- 
ciency, and  consistency.  Convergence  characteristics  of  these  algorithms 
are  also  studied  and  improved,  in  some  cases,  with  a slight  degradation 
in  the  statistical  properties.  Computation  time  and  storage  requirements 
are  minimized  so  that  these  techniques  can  be  used  in  real  time  application. 

Section  3.2  introduces  the  instrumental  variable  approach.  Two  cases 
are  considered.  In  the  first  case,  all  the  state  variables  are  measured 
and  in  the  second,  only  some  of  the  states  are  measured.  Identifiability 
problems  are  also  discussed.  This  is  followed  by  a discussion  of  the  on-line 
maximum  likelihood  methods  in  Section  3.3.  On-line  methods  are  given  for 
both  linear  and  nonlinear  systems.  The  last  section  discusses  the  relation- 
ship between  the  instrumental  variables  and  on-line  maximum  likelihood 
methods  and  gives  the  advantages  and  disadvantages  of  each  method. 
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3.2  THE  INSTRUMENTAL  VARIABLES  APPROACH 

As  introduced  in  Chapter  II,  the  instrumental  variable  approach  is  an 
equation  error  method  in  which  the  bias  of  the  least  square  methods  is 
removed  by  proper  choice  of  the  instrument  matrix.  The  techniques  of  Wong 
and  Polak  [10],  Young  [11],  and  Pandaya  [12]  can  be  used  with  a single- 
input single-output  system  in  canonical  form  or  for  a scalar  autoregressive 
moving  average  (ARMA)  process.  Moreover,  their  methods  are  applicable  only 
for  systems  in  which  the  state  equation  or  the  ARMA  equation  is  noise  free 
but  the  outputs  are  contaminated  by  colored  noise.  All  these  authors 
select  efficient  instrumental  variables  with  a possibility  of  divergence. 
Only  Pandaya  gives  attention  to  the  divergence  problem.  The 
technique,  developed  here,  can  be  used  with  general  multi-input  multi - 
output  systems  when  both  process  noise  and  measurement  noise  are  present. 
The  technique  gives  special  attention  to  the  convergence  properties  of  the 
algorithm. 

3.2.1  Problem  Statement 


Consider  a discrete  time  linear  system 
x(k+l)  = <t>x(k)  + Gu(k)  + w(k) 

x(0)  = xQ  (3.1) 

and  the  measurements 

z(k)  = Hx(k)  + Du(k)  + v(k)  (3.2) 

where  x(k)  is  an  n x 1 state  vector,  u(k)  is  a q x 1 input  vector,  and 
y(k)  is  a m x 1 output  vector.  w(k)  and  v(k)  are  white  noise  sources 
such  that 


22 


mm 


T.77 


ECw(k))  = 0 

E(v(k))  =0  k = 1,  2,  . . . 


and 


E(w(k)  wT(0)  = Q 6kj£ 

E(v(k)  vT(l))  = R 6k(l 

E(w(k)  vrU))  =0  k,»  - 1,  2,  . . . 


(3.3) 


(3.4) 


, G are  either  constant  or  slowly  varying  matrices  and  are  not  known.  The 
measurement  distribution  matrices  H and  D are  assumed  known.  An 
alternate  representation  of  the  system  is  in  terms  of  the  innovations  and 
predicted  state  estimate 

x(k+l)  = <jix(k)  + Gu(k)  + 4>Kv(k) 

z(k)  = Hx(k)  + Du(k)  + v(k)  (3.5) 

where  K is  the  Kalman  gain  which  in  ''steady  state"  follows  the  equations 
(constant  <j>,  r,  Q,  R and  H) 

P = ♦(I-KH)P*1  + rQr1 

K = mT(HPHr  + R)'1  (3.6) 

where  x(k)  is  the  expected  value  of  the  state  x(k)  given  the  measure- 
ments z(l),  z(2)  . . . z(k-l).  It  is  well  known  that  the  innovation 
sequence  v(l),  v(2)  . . . v(k)  is  a white  noise  process.  Since  both 
z(k)  and  u(k)  are  known,  it  is  possible  to  rewrite  the  measurements  by 
subtracting  the  contribution  of  linear  combination  of  u . Thus  it  is 
general  to  assune  that  D is  .'ero. 
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The  on-line  identification  problem  consists  of  tracking  parameters 
in  $,  G and  possibly  noise  sources.  For  sake  of  simplicity,  we  consider 
two  cases:  (a)  Case  1,  where  matrix  H has  rank  n,  and  (b)  Case  2, 

where  matrix  H has  rank  smaller  than  n . The  first  case  leads  to  a 
simple  instrumental  variable  algorithm. 

3.2.2  Instrumental  Variable  Approach  When  the  Observability  Index  is 
One  (Rank  (H)  = n) 

In  aircraft  application,  usually,  there  are  many  measurements 
available.  In  most  cases  it  is  possible  to  reconstruct  a noisy  estimate 
of  the  state  at  any  point  by  using  noisy  measurements  at  that  point. 

This  implies  that  the  observability  index  is  one  and  the  rank  of  H is 
equal  to  the  number  of  states.  This  is  possible  only  if  the  number  of 
measurements  is  not  smaller  than  the  dimension  of  the  state  vector. 

When  rank  (H)  is  n , for  any  positive  definite  m x m matrix 
* TA-1 

R,  H R H is  invertible.  Premultiplying  Equation  (3.2)  by 

T * (HTR- XHJ " 1 li'p/1  (3.7)  I 

we  have 


z,p(k)  = x(k)  + v,p(k) 

where 

Zj  = Tz  etc. 

and  E(vT(k)  v.[(«.))  = TRT1^  ; 

As  a special  case  when  H is  square  and  of  rank  n 
T = H'1 
24 


(3.8) 


(3.9) 

(3.10) 


(3.11) 


. 


It  is  clear  from  (3.7)  and  (3.10)  that  if  R = R 

E(vt0c)VjU)  « HTR"1H6k  l (3.12) 

Since  the  Fisher  information  matrix  for  a set  of  measurement?  is  an  inte- 
T -1 

gral  of  H R H,  it  is  clear  that  the  above  transformation  on  the  measure- 

/N 

ments  maintains  the  information  content  if  R = R.  In  general , R is  not 
known,  so  the  best  possible  approximation  for  R should  be  used  in  the 

A 

transformation  (3.7).  (It  can  be  shown  that  any  other  R , which  is  not  a 
multiple  of  R,  reduces  the  information.)  From  now  on  in  this  section, 
subscript  T will  be  removed  from  the  variables  and  Eq.  (3.2)  with  number  of 
measurements  equal  to  number  of  states  and  H equal  to  a unity  matrix  will 
be  considered  to  be  the  general  equations.  Using  (3.8)  and  (3.1): 


z(k+l)  = (k)  + Gu(k)  + w(k)  + v(k+l)  - <$v(k) 

(3.13) 

Let 

yoo 

(3.14) 

Lu(k)J 

0 = [<fr;G] 

(3.15) 

and 

n(k)  = w(k)  + v(k+l)  - <f>v(k) 

(3.16) 

Therefore,  (3.11)  becomes 

z(k+l)  = 0y(k)  + n(k)  (5.17) 

We  now  consider  the  choice  of  the  instrumental  variables.  Their  role 
has  been  explained  in  the  previous  chapter.  To  be  useful,  they  must  satisfy 
the  following  conditions: 

(a)  Since  the  noise  term  in  Eq.  (3.17)  is  a linear  combination  of  w(k) , 
v(k),  and  v(k+l),  it  is  necessary  that  the  instrumental  variables 
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be  independent  of  these  noise  terms  so  that  the  estimates  are 
unbiased.  In  other  words,  the  selected  instrumental  variables 
should  be  independent  of  the  measurements  yOO,  y(k+l),  y(k+2), 
• • • »y(N) . 

The  instrumental  variables  should  be  correlated  with  y(k).  In 
particular,  if  y^(k)  is  the  instrumental  variables  vector 


1 t 

plim  £ y(k)  yT (k)  should  be  nonsingular 
N k 1 


(3.18) 


where  "plim"  stands  for  probability  in  the  limit.  This  condi- 
tion is  necessary  for  the  estimates  to  be  consistent.  It  will 
be  shown  later  that  the  matrix  in  Eq.  (3.18)  is  related  to  covari- 
ance of  parameter  estimation  errors. 

(c)  The  efficiency  of  the  estimates  depends  on  the  cross-correlation 
between  y(k)  and  yj(k). 

Consider  two  possible  choices  of  the  instrumental  variables  vector 


(3.19) 


x(k)  is  an  estimate  of  x(k)  based  on  z(l) ,z(2) , . . . ,z(k-l) . Note, 
first,  that  both  choices  (a)  and  (b)  satisfy  condition  (a),  ensuring  that 
the  estimates  are  unbiased.  Choice  (a)  does  not  use  parameter  values  to 
determine  the  instruments.  On  the  contrary,  choice  (b)  implicitly  implies 
the  use  of  estimated  parameters  to  obtain  the  best  estimate  of  the  current 
state  from  past  measurements.  In  other  words,  the  estimated  parameters  are 
"bootstrapped"  for  state  estimation  and  determination  of  instrumental  vari- 
ables. 


It  will  now  be  shown  that  both  choices  (a)  and  fb)  have  advantages, 
yj^  (k)  may  not  have  a high  correlation  with  yfk)  because  of  two  reasons: 
(a)  both  z(k-l)  and  zfk)  are  contaminated  with  random  noise,  and  fb)  the 
deterministic  parts  of  zfk-1)  and  zfk)  are  not  the  same.  However,  this 
choice  ensures  that  the  necessary  condition  fb)  for  the  instrumental 
variables  is  satisfied,  because 
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which  is  nonzero  for  persistently  exciting  systems  except  in  trivial  cases. 
Thus,  independent  of  the  initial  parameter  values  or  the  parameter  values 
at  any  point  in  time,  the  instrumental  variables  yj^(k)  ensure  that  the 
resulting  parameter  estimates  are  consistent  but  not  necessarily  very  effi- 


cient. 


Yj  j (k)  uses  the  parameter  estimates  to  predict  the  state  vector  which 

is  a part  of  the  instrumental  variables.  If  the  parameters  are  close  to  the 

true  values,  the  state  estimate  would  be  good  and  the  instrumental  variables 

would  be  efficient.  However,  if  the  parameters  are  far  from  the  true  values, 

the  instrumental  variables  may  not  even  be  consistent.  Rowe  [17]  and  Pan- 

f2) 

daya  [12]  use  instruments  of  the  type  yj  ’ fk)  for  certain  ARMA  models.  Pan- 
daya  has  observed  the  divergence  problems  and  uses  an  adaptive  filter  to 
reduce  the  problem.  However,  the  convergence  of  his  algorithm  cannot  be 
ensured. 

It  is  desirable  to  incorporate  the  best  properties  of  yf^fk)  and 

f2)  1 

yj  ’ fk)  in  the  selection  of  the  instrumental  variables.  This  is  done  by 

introducing  the  Insured  Instrumental  Variables  (I IV) . Two  choices  of  IIV 
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1.  A linear  combination  of  choices  (a)  and  (b) 

Yj  00  = ay^Ck)  + (1  - a)  y[2j  (k) 

0 < a < 1 (3.21) 

2.  After  every  r data  points,  set 

x(k)  = y(k-l)  1 <_  r < °°  (3.22) 

and  in  between  propagate  the  state  equations  to  use  y|2'(k) 
as  the  instruments.  This  way,  even  if  the  filter  were  diverg- 
ing, the  state  estimate  may  not  diverge  in  r sample  periods. 


Both  of  the  above  choices  are  called  insured  instrument  variables  be- 
cause in  each  case,  a penalty  is  paid  to  ensure  convergence.  By  choosing 
a greater  than  zero  in  Eq.  (3.21)  and  r finite  in  Eq.  (3.22),  the  estimates 
are  not  as  efficient  as  they  could  be  if  the  parameters  are  close  to  the 
true  values.  However,  they  are  not  as  bad  as  they  would  be  if  the  para- 
meter values  used  to  predict  the  state  vector  are  very  wrong,  a could 
be  chosen  as  one  to  start  with  and  decreased  as  convergence  occurs. 

Similarly,  starting  from  value  of  one,  r could  be  increased  as  true 
parameter  values  are  obtained. 

Once  the  instrumental  variables  vector  is  chosen,  the  problem  is  to 
use  it  for  determining  parameter  estimates.  If  there  are  N+l  observa- 
tions z(l) ,z(2) , . . . ,z(N+l) , we  can  write  Eq.  (3.17)  compactly  as 


ZN  " 0YN  + nN 

(3.23) 

where 

ZN  A (z(2) , z (3) , 

...,  z (N+l) } 

(3.24) 

and 

Yn  A (y (1 ) , y(2) , 

. . . , y(N) } etc. 

(3.25) 
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Let  Ym  be  the  instrumental  matrix  defined  as 


yn  a [yjCl),  yj(2j,  yj(N)] 


C3.26) 


Postmultiply  (3.23)  by  Y, 


Z Y ^ = 0Y  Y ^ + nY^ 
N*N  N N nN  N 


(3.27) 


An  estimate  of  0 based  on  N+l  observations  is, 


®N  ‘ <ZNfNT)  «nV>'‘ 


(3.28) 


This  can  be  converted  into  a recursive  algorithm 


0N  = 0N-1  + 


(z (N+l)  - ©^^(N)}  pr(N)VN_1 
(1  + yT(N)VN.iy(N)) 


(3.29) 


where 


Vi^yWyW^ 

N N_1  1 + yT(N)VN1y(N) 


VN  ’ ^ V>" 


(3.30) 


(3.31) 


In  many  cases,  the  data  rate  is  fast  and  the  parameter  estimates  are 
not  required  after  every  data  point.  In  that  case,  it  is  computationally 
inefficient  to  use  Fqs.  (3.29)  and  (3.30).  Suppose  the  parameters  have  to  be 
updated  after  every  k measurements.  Equation  (3.28)  can  be  written  as 


0N-k  (\\+k  '‘N+kj  = ZN+k  ^N+k 


(3.32) 
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which  gives 


-T  N+k  „T  - N+k  T 

(0N+k  " °N)  YN+k  YN+k  = . J . z(i+1)  y ^ * eN  1 yC±)yT (i) 

i=N+l  i-N+1 


(3.33) 


Depending  on  k,  various  formulae  can  be  used 


[i)yT(i) ! v„. 


°N+k  = eN  + j4+1  Ci)  - eN  i=Z+i  y(i)y  (i)  J VN+k 


for  large  k 


(3.34) 


' °N  * i-N.1  <2ti*1)  ' V(i))  * H)  Vf(*k 


for  small  k 

VN+k  can  be  obtained  directly  by  inversion  or  Eq.  (3.30)  can  be  used  to  up- 
dating. Again,  for  large  k,  it  is  computationally  more  efficient  to  perform 
a direct  inversion,  while  for  small  k,  Eq.  (3.30)  should  be  used.  These 
trade  offs  can  be  determined  by  counting  the  number  of  computations  required 
per  parameter  update.  The  best  method  depends  on  the  problem  at  hand. 

Many  times  the  parameters  of  the  system  are  changing  slowly.  To  be 
able  to  track  time  varying  parameters,  it  is  necessary  to  "phase  out"  the 
estimate  based  on  old  measurements.  One  method  to  do  this  is  to  use  an 
exponentially  fading  filter  on  past  measurements  leading  to  the  following 
equations. 


0N  = 0N-1  + 


{z (N+l)  - oN_iy(N) > y (n)vn-] 
(p  * yT(N)  VN.J  y(N)) 


(3. 35) 
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0 < p < 1 


(3.36) 
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and 


V = — 
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V, 


VN.J  y(N)  y (N)  vN.x 


N-l 


-T 


(p  + y‘(N)  VN.iy(N)) 


3.2.3  Instrumental  Variables  Approach  When  the  Observability  Index  is 
Greater  Than  One  (Rank  (H)  < n) 

In  many  identification  problems,  the  measurements  of  all  state  vari- 
ables are  not  available.  In  other  words,  it  is  not  easy  to  construct  state 
estimates  at  a point  from  the  measurements  at  that  point.  It  is  well  known 
that  in  this  case,  even  if  H is  known,  it  is  not  possible  to  obtain  esti- 
mates of  all  parameters  in  0 and  G [18|.  Also,  it  is  not  easy  to  deter- 
mine which  set  of  parameters  in  (<f),  G)  are  identifiable.  We  will  assume 
a certain  parameter  set  in  (<p,  G)  is  identifiable. 


Consider  the  case  where  the  system  is  completely  controllable  from  each 
input  and  the  p measurements  y are  the  noisy  measurements  of  the  last 
p states.  Let  and  G^  (i  _>  n-p,  all  j)  be  the  set  of  unknown 

parameters.  It  is  known  that  this  set  of  parameters  is  identifiable  from 
the  measurements.  Other  parameters  in  and  G are  assumed  known.  The 
Type  II  canonical  form  of  Denham  [18]  is  a special  case  of  this  form  and  the 
Type  I canonical  form  can  be  transformed  in  this  form.  Substituting  the 
measurements  in  the  equations  of  motion,  we  have 
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(3.37) 
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Defining, 

0 = ^21  ! ^22  I G2^ 

yT(k)  = [x^(k) , ZT(k) , uT(k)]  (3.38) 

n2(k)  = r2w(k)  - <t22v(k)  + v(k+l) 

we  get 

z(k+l)  = 0y(k)  + n2(k)  (3.39) 

Similarly,  the  first  n-p  equations  in  (3.37)  can  be  written  as 

f z(k)l 

X2(k+1)  - ^^2  00  + (^22  [_u (k) J + ^^w^k)  ■ ^12  v(k) 

(3.40) 

For  the  purpose  of  identification,  Eq.  (3.39)  resembles  Eq.  (3.17)  except  that 
X2(k)  is  not  known  exactly.  The  instrumental  variables  must  be  selected 
based  on  the  same  considerations  as  in  the  last  section.  Since  X2(k)  is 
not  known  exactly  and  there  is  no  measurement  of  these  states,  choices  (a) 
and  (b)  defined  by  Eqs.  (3.18)  and  (3.19)  will  have  to  be  modified  somewhat. 
Therefore  X2(k)  is  determined  by  a direct  integration  of  Eq.  (3.40) 
after  the  noise  terms  are  dropped.  Since  the  parameters  in  this  equation 
are  exactly  known,  the  estimate  of  X2 (k)  is  noisy  but  will  not  diverge 
if  *21  is  stable.  In  choice  (b)  for  the  instrumental  variables,  X2(k) 
and  x2(k)  are  determined  from  the  measurements  in  the  best  possible  way. 

After  the  instrumental  variables  are  chosen,  the  recursive  procedures  of 
Section  3.2.2  can  be  used. 


JJ 


3.2.4  State  Estimation  When  the  Parameters  Are  Not  Known  Exactly 

The  main  problem  in  using  the  instrumental  variables  of  Eq.  (3.19)  is 
the  determination  of  efficient  state  estimates  from  the  measurements  when 
the  system  parameters  are  not  known  exactly,  in  other  words,  we  need  an 
algorithm  to  find  a way  of  determining  a good  estimate  x(k)  given  z(l), 
z(2),  ...,  z(k-l),  recursively. 

If  the  parameters  of  the  system  are  known,  the  best  estimate  x(k) 
can  be  determined  using  the  Kalman  filter,  i.e. , 


x(k+l)  = 4>x(k)  + Gu(k)  + 4>Kv(k) 


(3.40) 


v(k)  = z (k)  - X(k) 


(3.41) 


In  an  actual  implementation,  the  best  estimates  of  4 and  G must  be  used 
in  Eqs.  (3.40)  and  (3.41)  since  the  true  values  are  not  known.  This  algorithm 
is  good  when  the  parameters  are  close  to  the  true  values.  If  this  is  not 
so,  a major  contribution  in  the  innovation  v(k)  is  due  to  the  incorrect 
state  updates  caused  by  erroneous  4 and  G in  Eq.  (3.40).  This  is  particu- 
larly true  when  the  process  noise  is  small  compared  to  the  measurement 
noise  (i.e.,  the  Kalman  filter  gains  K are  small).  If  Eqs.  (3.40)  and  (3.41) 
are  used  since  the  start  of  the  algorithm,  the  estimates  may  diverge 
initially,  giving  inconsistent  estimates.  One  way  to  handle  this  problem 
is  to  account  for  the  incorrect  parameter  values  by  increasing  the  covariance 
of  the  process  noise.  In  addition,  it  may  be  necessary  to  update  the 
Kalman  gain  as  the  parameter  estimates  improve.  An  adaptive  filter  sug- 
gested by  Pandaya  [12]  can  also  be  used.  It  is  important  that  this  filter 
be  good,  because  the  efficiency  of  the  instrumental  variables  depends  on 
the  correlation  between  y(k)  and  ?(k). 
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3.2.5  Parameter  Estimation  Error  and  Identifiability 


In  Sections  3.2.2  and  3.2.3,  we  discussed  the  set  of  parameters  which 
are  structurally  identifiable.  However,  the  input  may  be  such  that  it  does 
not  excite  all  the  modes  of  the  system.  It  is  clear,  then,  that  it  will 
be  impossible  to  get  good  estimates  of  some  of  the  system  parameters.  To 
determine  this  on-line,  we  need  to  know  the  accuracy  with  which  the  para- 
meters are  estimated.  From  Eqs.  (3.27)  and  (3.28),  it  is  clear  that 


(eN  -0)  yn  ynt  = nN  ynt 

A A rn  A rn 

SN  -0  * "N  V <YN  V> 

A 

n(k)  is  not  correlated  with  y(k) 
it  is  easy  to  show  that 

plim  nN  ynt  (Yn  ynt)-x 


-1 


but  is  correlated  with 


= 0 


y(k). 


(3.42) 

(3.43) 


However , 


(3.44) 


if  the  conditions  regarding  the  efficient  instrumental  variables  are  satis- 
fied. The  covariance  of  the  estimation  error  is  difficult  to  determine. 

It  can  be  obtained  by  neglecting  the  correlation  between  nN,  Y^.  and  Y^, 
approximately,  as 


A rp  -1  A A rp 

Di  - Cyn  yn > yn  yn 


Yv,1')"1  (Qii  + Ru  - (<|>R4>T)ii> 


N N 


(3.45) 


where  D.  is  the  covariance  of  the  estimation  error  of  parameters  in  the 
ith  row  of  0.  Using  the  approximation 


A A rp  A rp 

v Y 1 = Y Y 1 
TN  XN  N N 


(3.46) 
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we  get 


= (Yk,  yJ)'1  {Q. . + R. . - (<pR<pT) . . } 
i v N N ’ xn  11  v Y Jn 


C3.47) 


3.2.6  .Application  to  Aircraft  Parameter  Identification 


Decoupled  longitudinal  and  lateral  aircraft  equations  of  motion  are 
given  by  Etkin  [19].  In  the  aircraft  equations,  many  terms  in  the  longi- 
tudinal and  lateral  equations  are  unity,  zero  or  otherwise  known.  This 
fact  can  be  used  to  advantage  in  the  instrumental  variable  approach. 

The  state  equations  for  the  lateral  motions  of  an  aircraft  are 
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dt 
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N N N0  0 
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sin  a -cos  a ^ 
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(3.48) 


Suppose  there  are  noisy  measurements  of  p,  r,  6 and  <j>.  Let  the  sampling 
interval  be  A.  The  discrete  time  equivalent  of  the  above  system  is,  for 
small  A, 


The  approximation  of  Eq.  (3.46)  is  as  good  as  the  estimates  of  the  state 
variables  used  as  instruments.  As  the  number  of  data  points  increase  and 
the  estimates  of  parameters  improve,  Eq.  (3.46)  will  be  a better  approxima- 
tion. If  Eq.  (3.46)  holds,  VXT  is  a good  measure  of  the  parameter  estimation 
errors.  The  diagonal  terms  of  this  matrix,  when  multiplied  by  the  second 
term  on  the  right  hand  side  of  Eq.  (3.47),  provide  covariance  of  estimation 
error  on  each  parameter. 
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p(k+l)  "I  r 1+1  A L A L A 

P r P 

r (k+1)  N A 1+N  A N.A 

= P r B 

B(k+1)  (sin  o)a-(cos  c<)a  1+Y^A 

<J>(k+l)_  L A (tan  q)a  q 
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1 J L<J)(k)J  L 0 0 _ 


Since  the  last  column  of  0 and  last  row  of  4>  and  G do  not  contain 
any  unknown  parameters,  the  above  equations  can  be  simplified 
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Equation  (3.50)  can  be  used  for  identification,  while  Eq.  (3.49)  is  used  for 
prediction  and,  hence,  determining  the  instruments. 


The  four  states  (angle-of -attack  a,  forward  speed  u,  pitch  rate  q, 
and  pitch  angle  9),  which  describe  the  longitudinal  motions  of  an  aircraft 
in  discrete  form  are 
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which  could  be  simplified  as  before 
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Examples  of  the  application  to  flight  test  data  of  this  algorithm 
are  detailed  in  Chapter  IV. 


3.3  ON-LINE  MAXIMUM  LIKELIHOOD  METHODS 


The  likelihood  method  has  been  a subject  of  considerable  study  in 
the  parameter  identification  work  [13].  The  method  has  been  very  attractive 
because  the  resulting  estimates  possess  desirable  statistical  properties. 

It  can  be  shown  that  the  likelihood  estimates  are  asymptotically  efficient, 
unbiased,  and  consistent.  This  technique  converts  the  identification 
problem  into  an  optimization  problem  of  the  likelihood  function.  The  like- 
lihood method  is  basically  a batch  technique  requiring  several  passes 
through  the  data  because  the  likelihood  function  is  nonlinear  in  parameters. 
Therefore,  it  is  usually  implemented  as  an  off-line  technique.  The  maximum 


likelihood  method  has,  however,  been  recently  modified  to  make  it  an 
effective  tool  for  on-line  and  real  time  identification  purposes.  The  tech- 
niques will  try  to  integrate  the  requirements  of  efficiency  and  convergence 
and  low  computation  time  and  storage. 

3.3.1  Off-Line  Maximum  Likelihood  Method  With  Newton- Raphson  Optimization 


Consider  a discrete  time  system 


x(k+l)  = $x(k)  + Gu(k)  + fw(k) 


(3.53) 


with  measurements 


y (k)  = Hx(k)  + v(k) 


i=l,2,3,...N 


(3.54) 


The  negative  log -likelihood  function  is 


I ^ T1  _ 1 

JN  = 7 Z {v*(k)  B 1 v(k)  + log  | B | > 
N z k=l 


(3.55) 


where  v(k)  are  the  innovations  and  B is  the  covariance  of  the  innova- 
tions (see  Eqs.  (3.5)  and  (3.6)).  We  assume  here  that  B is  a constant. 
A direct  optimization  with  respect  to  B gives 


1 T 

B = w E v(k)  v1  (k) 

N k=l 


(3.56) 


Substituting  R in  the  negative  log-likelihood  function 


= 


2 log  | B j + constant 


(3.57) 


The  first  derivative  of  the  cost  function  is 
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(3.58) 


and  the  second  derivative  (which  is  the  information  matrix)  is, 
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(3.59) 


In  the  Newton- Raphson  procedure,  the  parameter  step  is  determined  by  solv- 
ing the  linear  equations 


9J 


^ A0n 
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90 


(3.60) 


After  updating,  the  new  parameters  are  now  used  to  recompute  the  first 
and  the  second  gradients  of  the  negative  log- likelihood  function, 
which  determines  the  next  parameter  step.  This  procedure  is  repeated  until 
the  gradient  is  insignificant,  the  cost  stops  decreasing  and  A0  becomes 
small. 


9v(k) 


Both  the  first  and  the  second  gradients  of  are  functions  of  ^ 
These  gradients  are  computed  by  differentiating  Eqs.  (3.5)  and  (3.6)  with 
respect  to  each  parameter 
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The  gradient  of  Kalman  gain  with  respect  to  6^  is  determined  from  Fq.  (3.6) 
If  there  are  m parameters,  the  computation  of  the  first  and  second 
gradients  of  the  cost  functions  requires  the  solution  to  (m+l)n  difference 
equations.  A flow  chart  of  the  algorithm  is  shown  in  Figure  2. 


3.3.2  Modification  of  the  Off-Line  Likelihood  Method  for  On-Line  and 


Real  Time  Applications 


There  are  two  basic  problems  with  the  off-line  maximum  likelihood 
method  for  on-line  and  real  time  applications.  These  are:  (a)  too  much 

computation  time  is  required  in  solving  the  sensitivity  equations,  and 
(b)  the  entire  data  record  must  be  stored  because  several  passes  are 
required  through  the  data  record.  Twd  major  modifications  are  made  to 
overcome  these  problems.  First,  gradient  computation  must  be  simplified 
and  secondly,  more  rapid  techniques  for  parameter  update  must  be 
implemented.  Gradient  computations  may  be  simplified  by  using  the 
requirement  that  the  system  is  linear.  The  computation  of  the  first 
and  second  gradients  of  the  cost  function  then  does  not  require  the 
explicit  computation  of  the  sensitivities  of  the  innovations  with 
respect  to  the  parameters.  This  result  is  shown  as  follows.  The 
innovation  gradient  with  respect  to  all  parameters  can  be  written  in 
matrix  form 


FLIGHT  TEST  DATA  , WIND  TUNNEL  VALUES  OF 
AERODYNAMIC  PARAMETERS 


j J = £{vTB" 1v+log | B | } 


30‘  e*  0 


,-l  9 J 
1 le 


Figure  2 Flow  Chart  of  Maximum  Likelihood  Identification  Program 
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(3.68) 


Equation  (3.63)  is  the  principal  reason  for  high  computation  time 
using  the  maximum  likelihood  technique.  Figure  3 simply  illustrates  the 
increase  in  the  number  of  states  plus  sensitivity  equations  as  the  number 
of  parameters,  m,  is  increased  (see  Figure  1 to  estimate  execution  time). 
It  is  shown  in  Appendix  A,  however,  that  this  system  of  equations  can 
be  reduced  to  a lower  order  system,  specifically 


xA(k+l)  = <f>AxA(k)  + Ga  u(k) 
Xq (k)  = TxA(k) 


(3.69) 


The  order  r of  the  system  is  smaller  than  or  equal  to  n(q+l).  Note  that 
Ga J are  in  controller  canonical  form.  Substituting  x^(k)  from 
Eq.  (3.69)  into  F.q.  (3.63),  we  have 


vQ(k)  = yQ(k)  - H T xA(k)  A y (k)  - T xA(k) 


(3.70) 


The  first  p elements  of  vQ  are  the  innovations  v,  the  next  p elements  are 
the  gradient  of  the  innovations  with  respect  to  the  first  parameter,  and  so 
on.  Let  Tq  be  the  first  p rows  of  T,  T^  the  next  p rows,  and  so  on. 

Then 


9J..  N T i 

^ = E V-Vk)  B'1  v(k) 
"i  k=l  1 


E (-  xAT(k)  T.T)  B'1  (y(k)  - Tq  xA(k)) 
k=l 


(3.71) 
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Figure  3 Number  of  State  Plus  Sensitivity  Equations  to  be  Integrated 
on  Each  Iteration  of  Maximum  Likelihood 
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(3.72) 


Thus,  the  computation  of  the  gradient  and  the  second  gradient  of  the  nega- 
tive log- likelihood  function  requires  the  determination  of  the  autocorrela- 
tion of  xA  and  the  cross-correlation  of  x^  and  y.  This  requires  the 
propagation  of  only  the  x^  equations. 

The  estimate  of  B is 


B = i Z v(k)  vT(k) 
N k=l 


i ( N T N j T -1 

i Tr  Z y(k)  yT(k)  + £ xA(k)  xA  (k)  T B T 

N ( k=l  k=l  A A 


- 2 Z yfk)  xAT(k)  TqT  B'1  (3.73) 


Note  that  the  various  matrices  involved  in  the  computation  of  the 
cost  and  the  first  and  second  gradients  of  the  cost  are  submatrices  of 
T*  defined  as 


?1 11 1 Ti  i T2 ! -N 


T*  can  be  computed  and  stored  before  the  on-line  identification  is  begun. 
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The  second  modification  is  related  to  the  first  and  consists  of 

approximations  to  the  gradients.  In  general,  the  derivatives  of  J„ 

N 

computed  using  the  a priori  values  of  the  derivatives  are  in  error 
since  these  values  are  usually  not  correct.  As  mentioned  before,  the 
maximum  likelihood  optimization  will  seldom  converge  in  one  Newton- 
Raphson  step,  requiring  computation  of  4> A and  T for  parameter  values 
at  every  step. 

Three  techniques  to  reduce  this  problem  are  as  follows. 

1.  It  may  be  possible  to  use  the  same  sensitivity  equations  even 
after  the  parameters  are  updated  until  the  parameter  step  is 
large  enough  to  cause  appreciable  error  in  the  computation  of  the 
gradients.  In  other  words,  the  same  sensitivity  l(iations  are  not 
updated  nearly  as  often  as  the  parameter  step.  This  method  is 
simple  but  is  applicable  only  when  the  initial  parameters  are  fairly 
good  or  the  information  about  the  parameters  is  obtained  at  a low 
rate  (i.e.,  low  system  excitation  or  high  noise-to-signal  ratio). 

In  either  of  these  cases,  the  parameter  step  will  be  small  over 
a short  period  of  time. 

2.  The  gradients  of  ct>^,  and  T with  respect  to  the  parameters 
can  be  computed  and  stored.  Then  <j>^,  and  T can  be  modified 
more  often  than  in  the  last  case,  thus 

m 3<J>. 

V0)  = + if1  397  (0i  - ®P 

(3.75) 
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m 9G « 

Ga(6)  - Ca(9')  ♦ £ ^ (0J  - ep 

etc. 


I 


u 


Since  and  are  in  controller  canonical  form,  only  the 
canonical  variables  need  to  be  updated.  The  sensitivities 
of  only  these  canonical  variables  need  to  be  precomputed  and 
stored.  Notice  that  this  does  not  require  much  computation 
in  real  time. 

3.  The  state  xA  for  different  parameter  values  can  be  computed 
directly 


xA(k,0)  = xA(k,e') 


+ 


m 

I 

i=l 


(Qi 


9P 


(3.76) 


3xa 

is  computed  by  writing  the  sensitivity  equations  for  the 

system  of  equations  governing  xA(k).  The  sensitivity  reduction 
is  again  used  on  this  system  to  minimize  the  number  of  equations 
to  be  solved.  Then,  the  xA(k,9)  can  be  computed  directly  once 
9 is  known . 

The  preceding  techniques  are  useful  for  reducing  the  computer 
requirements  of  an  off-line  algorithm.  Results  for  flight  test  data 
are  given  in  Chapter  IV. 

3.3.3  Recursive  Equations  for  Real  Time  Application 

The  determination  of  the  parameter  step  size  requires  the  solution 
of  m simultaneous  linear  equations.  It  would  be  computationally  infeasible 
to  solve  these  equations  at  every  time  step  for  a real  time  algorithm. 

There  are  two  methods  to  obtain  the  parameter  step  with  a reasonable 
computation  time:  (a)  use  recursive  equations  to  update  the  information 

matrix  and  the  parameter  step  without  explicitly  computing  the  inverse 
of  the  information  matrix,  or  (b)  use  some  approximation  technique. 


47 


The  use  of  recursive  equations  is  based  on  the  solution  for  A0 


written  as 


-1  9^N 

A0N  ~ MN  9F~ 


N 


(3.77) 


The  recursive  equations  to  update  and  A6^  can  be  written  as 


M-1  = m'1  - M'1  9vT(Nfl)  Ip  . 9v(N+l)  -1  3vT(N+1) ) 
Vl  n 90  jB  “le  90  j 


-1 


MNdlu-i 

90 


(3.78) 


48n.j  - A6n  * Mjj1  ^V1?  jB  * MSil  M.'1 


-1 


90  ~ ^ 90 

T > 


v(n+i)  -ff  A0N; 


(3.79) 


These  equations  are  useful  only  if  the  number  of  measurements  is  small. 
Otherwise,  the  computation  time  is  too  large. 


Approximation  methods  are  superior  if  the  number  of  measurements  is 
large  and  the  parameter  values  are  not  required  at  every  measurement  point. 
Let  the  parameter  updates  be  required  every  k data  points,  then 


w A„  _ 3JN+k 
NW  A0N+k  90 


(3.80) 


aJN+k  3JN 


Vk  (AVk  - AV  = -55^  * if  - 0*Mc  - V A0N  f3-81) 
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N+k  T 

Vk  <«W  - 4V  ■ ^ B'1  bw  - rP  49n!  <3-8« 


The  second  term  on  the  right  hand  side  represents  the  "true"  innovations 
when  the  parameter  value  is  0Q  + A0^.  The  sum  on  the  right  hand  side  of 
the  equation  can  be  computed  recursively  for  any  k.  can  also  be 

computed  recursively.  To  simplify  the  solution  of  the  above  equation,  it 
is  assumed  that  the  process  is  in  progress  for  a long  time.  Then 


NWk  = ^ 


(3.83) 


Therefore,  the  inverse  of  the  information  matrix  does  not  have  to  be  com- 
puted at  every  point  the  parameters  are  updated.  Note  that  the  sum  on  the 
right  hand  side  of  Eq.  (3.82)  can  be  computed  using  the  sensitivity 
functions  reduction. 


3.3.4  An  On-Line  Maximum  likelihood  Method  for  Nonlinear  Systems 

In  many  applications,  the  computation  time  and  computer  storage  are 
at  a very  high  premium  while  the  accuracy  of  the  estimates  and  the  time 
required  to  get  accurate  estimates  is  a secondary  consideration.  This  may 
be  particularly  true  where  the  system  is  defined  by  a nonlinear  system  of 
equations  then  the  instrumental  variable  methods  are  not  applicable  and  it  is 
not  possible  to  use  reduced  sensitivity  equations.  Then  the  likelihood 
method  can  be  used  without  computing  the  gradients. 


Consider  the  system  following  the  equations  of  motion 


x(k+l)  = (x (k) , u(k),  0)  + w(k) 


(3.84) 


with  measurements  of  nonlinear  combinations  of  states  and  inputs 


z(k)  = h(x(k) , u(k) , 6)  + v(k) 


where  w(k)  and  v(k)  are  white  noise  sources  with  covariances  Q and  R, 
respectively,  and  cross -covariance  zero  and  0 are  the  vectors  of  unknown 
parameters. 
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The  negative  log- 1 ikel ihood  function  can  be  written  in  terms  of 
the  parameters 

K4  if. 


J(6)  = \ £ (v7(k)  B';l  v(k)  + log  |B|) 

k=l 


(3.8b) 


The  problem  is  to  find  a 0 which  minimizes  J(6).  Instead  of  using  a 
gradient  method,  a direct  search  technique  will  be  used.  See  Powell  [20] 
and  Stewart  III  [21]. 

The  underlying  principle  of  all  these  direct  search  methods  is  basi- 
cally the  same.  A certain  value  is  chosen  for  0 and  the  likelihood  function 
is  computed  for  this  value  of  0.  The  likelihood  function  is  computed  for 
another  value  of  0.  These  two  likelihood  functions  are  compared  and  are 
used  to  determine  a value  of  the  parameter  vector  at  which  it  would  be  most 
useful  to  compute  the  likelihood  function.  This  process  is  repeated  until 
a satisfactory  convergence  results. 

The  on-line  identification  requirements  make  a direct  application  of 
this  optimization  technique  infeasible  since  it  would  require  storing 
the  entire  data  and  propagating  the  equations  of  motion  through  this  data 
again  and  again  for  different  parameter  values.  One  method  which  seems  very 
promising  is  to  determine  the  likelihood  function  for  any  parameter  vector 
using  only  s points.  In  short,  the  likelihood  function  is  determined  for 
different  parameter  vectors  using  different  sections  of  the  data.  This  would 
also  obviate  the  problem  of  storing  the  data. 


Whittle  [22]  has  shown  that  the  limiting  distribution  of  twice  the  dif- 
ference of  the  negative  log -likelihood  functions  for  any  parameter  value  to 
its  expectation  for  true  parameter  values  can  be  approximated  by  a chi-square 
distribution  with  one  degree -of -freedom.  Thus,  the  smaller  the  s,  the 
higher  the  variance  of  the  likelihood  function  estimate  compared  to  its  value. 
In  the  beginning,  therefore,  s can  be  chosen  small  since  we  are  looking  for 


gross  changes  in  the  value  of  the  likelihood  function.  However,  as  conver- 
gence is  approached,  s should  be  increased  so  that  the  difference  in  the 
value  of  the  likelihood  function  is  not  lost  in  the  estimation  errors.  Large 
s would  also  reduce  the  errors  caused  in  the  comp  ^ison  of  the  likelihood 
functions  based  on  different  input  responses. 

Several  important  points  are  in  order  here.  If  more  computer  time 
is  available,  the  likelihood  function  can  be  evaluated  for  several  parameter 
vectors  simultaneously.  This  would  reduce  the  amount  of  data  required  to 
get  accurate  parameter  estimates.  The  optimization  routine  will  have  to 
be  modified  so  that  it  selects  several  points  in  the  parameter  space  where 
it  is  most  useful  to  determine  the  value  of  the  likelihood  function.  This 
feature  makes  the  method  very  attractive  for  real  time  application  because 
the  identification  routine  can  adapt  to  the  available  computer  time.  If 
the  computation  load  increases,  the  technique  could  star!  evaluating 
likelihood  functions  for  one  or  even  no  parameter  value. 

This  technique,  essentially,  treats  each  parameter  value  as  a separate 
model  and  chooses  the  model  for  which  the  negative  log- likelihood  function 
is  minimum.  With  some  modification,  it  can  be  applied  when  the  parameter 
space  is  not  compact,  in  particular,  for  two  models  with  different  dimen- 
sion and  parameterization. 

3.4  COMPARISON  OF  1HE  INSTRUMENTAL  VARIABLES  METHOD  AND  THE  ON-LINE 

MAXIMUM  LIKELIHOOD  .APPROACH 

In  Chapter  II,  it  was  shown  that  under  certain  circumstances,  the  least 
squares  and  the  maximum  likelihood  methods  belong  to  the  same  general  class. 
The  same  relationship  holds  between  the  instrumental  variables  method 
(least  squares  or  equation  error  type  method),  and  the  on-line  maximum 
likelihood  method.  The  instrumental  variables  method  differs  from  the 
least  squares  in  that  the  instrumental  variables  are  chosen  such  that  the 
resulting  estimates  are  biased.  The  on-line  maximum  likelihood  method  is 
a simplified  form  of  the  off-line  maximum  likelihood  method  involving  no 
storage  of  past  input/output  data  and  simplified  computation  of  the  first 
and  the  second  gradients  of  the  negative  log-likelihood  function. 
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From  the  application  viewpoint,  there  are  several  differences  in  the 
two  approaches.  These  differences,  which  dictate  where  either  of  these 
techniques  should  be  used,  are  summarized  here: 

1.  The  on-line  maximum  likelihood  method  is  more  general  and  sta- 
tistically more  efficient  than  the  instrumental  variables 
approach. 

2.  The  convergence  of  the  instrumental  variables  approach  of 
Section  3.2  is  assured,  while  the  maximum  likelihood  method  does 
not  have  guaranteed  convergence.  In  other  words,  the  instru- 
mental variables  method  does  not  require  good  starting  values  of 
parameters,  unlike  the  on-line  maximum  likelihood  method. 

3.  The  on-line  maximum  likelihood  method  requires  much  more  computer 
time  and  storage  than  the  instrumental  variables  approach. 

4.  The  maximum  likelihood  method  can  be  applied  in  a variety  of 
circumstances  as  long  as  the  parameters  are  identifiable.  More- 
over, the  maximum  likelihood  method  is  the  only  method  which  can 
be  used  with  nonlinear  systems. 


5.  The  single  step  of  the  Newton- Raphson  procedure  in  maximizing  the 
negative  log -likelihood  function  may  be  inadequate  if  the  para- 
meters are  far  from  the  true  values. 

3.5  SUMMARY 

This  chapter  presented  two  techniques  for  on-line  identification  of  para- 
meters of  state  variable  models.  The  first  method,  the  instrumental  variables 
approach  (IVA),  can  be  used  with  linear  systems,  is  extremely  quick  and  has 
guaranteed  convergence  irrespective  of  starting  parameter  values  and  changes 
in  parameter  values.  The  second  method  uses  the  maximum  likelihood  approach 
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and  can  be  applied  to  both  linear  and  nonlinear  systems.  For  linear  systems, 
the  computation  time  is  reduced  considerably  by  the  sensitivity  functions 
reduction  technique.  This  method  also  requires  good  starting  values  for  the 
p^  rameters. 

The  instrumental  variables  and  the  on-line  maximum  likelihood  approaches 
are  two  complementary  methods  for  on-line  and  real  time  estimation  of  para- 
meters in  dynamic  mdfills. 

The  next  chapter  presents  some  results  on  the  application  of  these 
methods  to  simulation  and  flight  test  data. 


IV.  APPLICATION  OF  ON-LINE  ALGORITHMS  TO 
SIMULATED  AND  FLIGHT  TEST  DATA 


4.1  INTRODUCTION 

In  the  previous  chapter,  two  on-line  and  real  time  parameter  identi- 
fication algorithms  were  developed.  Both  of  those  algorithms  have  excellent 
statistical  properties.  It  was  also  shown  that  the  first  method,  the 
instrumental  variables  approach  (IVA)  has  guaranteed  convergence  irrespective 
of  the  starting  parameter  values  and  the  measurement  noise.  The  on-line 
maximum  likelihood  (OLML)  method  requires  a higher  computation  time  and 
good  starting  parameter  values  but  produces  estimates  which  are  statistically 
somewhat  superior. 


This  section  details  the  ’"".lamentation  algorithm  for  the  instrumental 
variables  approach.  The  computer  program  based  on  this  algorithm  is  used 
both  with  simulation  data  (which  includes  process  noise  and  measurement 
noise  effects)  and  with  actual  flight  test  data  for  various  aircraft. 

In  the  simulation  data,  parameters  are  changed  in  the  middle  of  the 
simulation  to  determine  the  speed  with  which  new  parameter  estimates  are 
obtained.  The  maximum  likelihood  method  is  not  implemented  in  a separate 
on-line  computer  program  because  the  characteristics  of  the  likelihood 
method  are  well  known.  A study  on  the  computation  time  requirements  of 
the  On-Line  Maximum  Likelihood  method  is  presented. 


Section  4.2  gives  the  IVA  algorithm,  its  flow  chart  and  computation 
and  storage  requirements.  Simulation  results  are  presented  in  Section  4.3. 
The  parameter  estimates  from  flight  test  data  are  shown  in  Section  4.4. 
Section  4.5  discusses  the  implementation  aspects  of  the  real  time  and 
on-line  maximum  likelihood. 
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4.2  THE  IVA  ALGORITHM 

Table  2 shows  the  initial  computations  required  to  start  the  IVA 
on-line  algorithm  and  the  inputs.  The  actual  algorithm  for  each  new 
sampling  point  is  given  in  Table  3.  The  number  of  computations  and 
storage  locations  required  are  summarized  in  Table  4. 

The  three  tables  are  detailed  and  self-explanatory.  Some  aspects 
of  the  algorithm  need  further  comment. 

State  and  Covariance  Update:  In  the  impl mentation  of  Table  3, 

the  state  vector  is  estimated  by  using  a simple  Kalman  filter.  It 
could  be  modified  to  include  an  adaptive  Kalman  filter  or  a parameter 
insensitive  filter.  These  could  result  in  some  improvement.  Another 
technique  is  to  update  the  Riccati  gain  matrix  (state  estimation  error 
covariance)  based  on  the  identified  parameter  values  and  a priofi  or 
identified  process  and  measurement  noise  covariance.  The  covariance 
update  is  not  recommended  because  it  could  cause  problems  when  the 
parameter  estimates  are  far  from  the  true  values  and  it  also  requires 
considerable  additional  computation  time.  It  may  be  feasible  to  update 
it  at  regular  intervals. 

Choice  of  a:  The  a time  history  is  chosen  a priori.  Usually,  a 

should  be  chosen  close  to  one  in  the  beginning  when  the  parameter  values 
are  not  known  at  all.  The  a is  gradually  decreased  until  it  reaches 
a value  close  to  zero  (about  .1  is  sufficient).  This  low  value  should 
be  maintained,  thereafter.  This  ensures  good  convergence  in  the  beginning 
when  the  filter  is  operating  poorly.  If  parameters  of  the  system  change 
in  the  steady  state  operation,  the  method  is  guaranteed  to  converge  to 
new  parameter  values. 
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Table  2 

Inputs  and  Initial  Computation  for  IVA 


Initial  Computation 


K = KH  (Optional) 


T requires  storage  of  p x n 


INPUT 

SYMBOL 

STORAGE 

1.  Initial  Estimate  of  State 

x(l) 

n 

2.  Covariance  of  State  Estimate 

P(l) 

2 

n 

3.  Kalman  Gain  (Optional) 

K(l) 

nm 

4.  Process  Noise  Covariance  (Optional) 

Q 

2 

n 

5.  Measurement  Noise  Covariance 

R 

2 

ID 

6.  State  Transformation  Matrix 

♦l 

2 

n 

7o  Control  Distribution  Matrix 

G1 

nq 

8o  Measurement  Distribution  Matrix 

H 

mn 

T -1 

9.  (Y  Y j 

o o 

V 

(n+q)2 

10.  Past  Fading 

P 

1 

IVA  Algorithm 


XI.  UPDATE  STATE 


3 (Concluded) 


Table  4 

Number  of  Computations  and  Storage  for  Each  Step  of  the  IVA 


n = no.  of  states,  p = no.  of  measurements,  q = no.  of  inputs 


| 

Updating  4>  and  G:  Sometimes  the  parameters  0 change  wildly, 

particularly  in  the  beginning  of  a run  or  when  parameters  change.  The 
matrices  4>  and  G also  change  rapidly  causing  problems  in  determining 
the  instruments.  Therefore,  it  is  often  preferable  not  to  update  <j>  and 
G after  each  sample  point.  This  produces  a certain  "damping"  in  the 
parameter  identification  which  is  good  for  convergence. 

Choice  of  the  Past  Fading  Factor,  p:  The  past  fading  factor  should 
be  chosen  close  to  one.  If  it  is  too  close  to  one,  the  parameter  estimates 
are  more  accurate  when  parameter  values  are  constant  but  are  very  sluggish 
to  changing  parameter  values.  If  p is  too  far  from  one,  the  converse 
holds.  In  the  fault  detection  situation  (possible  change  in  parameter 
value),  the  factor  p is  associated  with  the  problem  of  false  alarm  and 
delayed  alarm.  In  the  data  processing  case,  p should  equal  one. 


4.3  APPLICATION  OF  IVA  TO  SIMULATION  DATA 


The  instrumental  variables  method  has  been  applied  to  the  DC- 8 and 
F-14  simulation  data.  The  equations  of  motion  are  summarized  in  Section 
3.2.6  for  both  the  longitudinal  and  the  lateral  cases. 

4.3.1  DC- 8 Simulated  Lateral  Motions 

The  state  equations  for  the  lateral  motions  of  a DC- 8 are  given  by 
Equation  (3.48).  For  the  purpose  of  identification,  it  is  assumed  that 
there  are  noisy  measurements  of  p,  r,  8 and  $.  The  measurement  noise 
covariance  matrix  is 

R = diag[2. 5 x 10'5,  2.5  x 10~5,  2.5  x 10'5,  2.5  x 10~5]  (4.1) 

The  sampling  interval  A is  .05  sec.  The  true  values  of  the  parameters 
and  inputs  used  in  the  simulation  are  given  in  Table  5.  The  variation 
of  a with  time  is  shown  in  Figure  4.  The  parameters  are  identified 
using  the  instrumental  variables  method.  The  parameter  values  at  the 
end  of  2.5  sec.,  5 sec.,  7.5  sec.,  and  10  sec.  are  shown  in  the  table. 

Parameters  Lp,  Lr,  Lg,  L6a,  N , Nr>  Ng,  N6r  and  Yg  are  close  to  their 
true  values  while  parameters  L$r,  N{a,  Y§a,  Y5r  are  far  from  their  true 
values.  This  is  the  identifiability  problem.  The  instrumental  variables 
method  does  not  check  if  any  parameter  is  identifiable  or  not.  It  simply 
produces  incorrect  estimates  for  unidentifiable  or  poorly  identifiable 
parameters. 

4.3.2  Short  Period  Motions  of  an  F-14  Aircraft 


The  short  period  motions  of  an  F-14  aircraft  in  wind  gust  disturbance 
are  simulated  using  the  following  equations  of  motion  for  a sampling 
interval  of  .05  sec.  (units  ft.,  deg.,  sec.). 


Table  5 

DC- 8 Simulation  Data  (Lateral  Mode) 


PARAMETER  TRUE  VALUE 


-2.029 

.4697 

-2.092 

22.02 

.4807 

-.05828 

-.2724 

1.655 

-.1709 

-1.36  80 

-.1274 

0 


1VA  ON-LINE  VALUE 

AFTER: 

0 SEC. 

2.5  SEC. 

5 SEC. 

7.5  SEC. 

10  SEC. 

-12.84 

13.96 

-2.0 

-2.106 

-2.106 

-2.7 

-.956 

2.32 

.511 

.441 

1.628 

-10.98 

-2.0 

-2.076 

-1.88 

-1.83 

-1.82 

20.0 

22.4 

22.39 

-.260 

1.4 

1.77 

.426 

.258 

-7.77 

-12.0 

,032 

-.0376 

-.0508 

-17.54 

-2.16 

-1.09 

-.228 

-.3106 

1.36 

-8.05 

1.65 

1.632 

1.716 

1.0534 

1.052 

-1.93 

-.348 

-.28 

-.0071 

-.48 

-.272 

-1.26 

-1.38 

-20.17 

-2.9 

.054 

.132 

-.1288 

.2314 

.23 

.95 

-1.31 

-1.274 

.01140 

-.74 

-1.08 

.196 

.232 

time  (sec) 


yy time  (sec) 


RUDDER  AILERON 

Figure  of  Rudder  and  Aileron  Inputs  for  DC-8  Simulation 


Figure  4 Variation  of  a With  Time 


where  a is  the  angle-of-attack,  q is  the  pitch  angle,  6e  is  the 
elevator  deflection  and  u is  random  wind  speed.  The  random  wind  is 
assumed  to  be  exponentially  correlated  with  RMS  value  of  30  ft.  sec.  ^ 
and  a correlation  time  of  1.1  sec.  There  are  noisy  measurements  of  a 
and  q.  The  following  cases  are  tried: 


Parameters  constant,  and 


The  results  for  a 40  sec.  long  experiment  with  zero 
and  optimal  K are  shown  in  Table  6. 


1547  -.0783  -.1547  -.1547  -.1547  -.1547  -.1547  -.1547  -.1547  -.1547 


(2)  Effect  of  K:  For  a 40  sec.  long  experiment,  the  measurement 

noise  covariance  matrix  is  taken  as 


and  all  the  parameters  are  doubled  instantly  at  20  sec 
Four  cases  are  tried: 


(b)  Optimal  K 

(c)  Nonoptimal  K 

(d)  Updating  K at  each  iteration  but  delaying  it 
for  .50  sec.  before  use. 


The  past  fading  factor  is  .99  is  all  these  experiments.  The 
results  are  given  in  Table  7 and  Figure  5.  It  is  clear  that 
using  the  Kalman  filter  is  a big  improvement,  while  suboptimal 
Kalman  gains  do  not  cause  much  deterioration. 


(3)  Effect  of  p:  To  speed  up  the  convergence  when  parameters 

change,  it  is  necessary  to  reduce  p.  This  reduction  in  p 
increases  parameter  estimation  error  in  the  steady  state. 
The  last  case  with  nonoptimal  K is  tried  with  p = 0.97. 
The  results  are  shown  in  Table  8 and  Figure  6. 


4.4  APPLICATION  TO  FLIGHT  TEST  DATA 


The  instrumental  variables  program  has  been  used  extensively  with  flight 
test  data.  To  date,  lateral  flight  test  data  of  an  advanced  Navy  variable 
swept  wing  fighter  and  the  longitudinal  flight  data  of  the  T-2B  Navy  trainer 
and  an  advanced  Navy  fighter  have  been  processed  using  IVA  for  the  purpose 
of  estimating  stability  and  control  coefficients.  The  program  has  been  very 
successful  in  each  of  these  cases.  One  example  is  given  here  to  illustrate 
IVA  application  on  such  flight  data  for  the  swept  wing  fighter. 


IVA  For  Simulated  Short  Period  Motions  of  an  F-14  Aircraft 
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Figure  5 Effect  of  Kalman  Gain  K on  IVA  Estimates 
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The  lateral  data  for  an  advanced  swept  wing  Navy  fighter  for  a differential 
horizontal  stabilizer  input  followed  by  a rudder  input  was  obtained  by  the 
Naval  Air  Test  Center.  The  input  is  shown  in  Figure  7.  There  is  a fairly  good 
excitation  of  all  lateral  modes.  The  sampling  rate  is  0.05  sec. 


The  lateral  equations  for  an  aircraft  are  given  in  Section  3.3.6.  The 
program  is  started  with  zero  values  for  all  parameters.  The  measurements  of 
roll  rate,  yaw  rate,  sideslip  angle  and  roll  angle  are  used  in  the  identifica 
tion. 


The  parameter  values  are  shown  at  the  end  of  2,  4,  6,  8,  and  10  sec  and 
are  compared  wi'h  maximum  likelihood  estimates  (in  which  a^.  is  also  used).  All 
parameters  are  fairly  close  except  Yg.  It  is  possible  to  get  a better  estimate 
of  Yg  using  maximum  likelihood  because  of  the  lateral  acceleration.  The 
estimates  of  and  are  poor  because  these  parameters  are  only  poorly 

identifiable.  r a 


4.5  IMPLEMENTATION  OF  THE  ON-LINE  MAXIMUM  LIKELIHOOD  METHOD 


The  on-line  maximum  likelihood  method  is  implemented  in  the  simplest 
form.  The  basic  idea  in  the  implementation  was  to  determine  the  computation 
time.  The  approximate  expression  (3.75)  was  not  used  to  update  the 
sensitivity  transition  matrices.  The  program  is  written  in  FORTRAN  and 
used  on  a UNIVAC  1108  computer.  For  this  reason,  it  is  believed  that  the 
computation  time  reported  here  is  higher  by  a factor  of  2 and  up  to  a 
factor  of  5. 


The  on-line  maximum  likelihood  program  is  illustrated  to  determine  longi- 
tudinal stability  and  control  coefficients  from  the  responses  of  a T-2B  aircraft 
The  sampling  rate  is  20  per  second.  Full  four  state  model  is  used  with  two 
inputs  and  five  measurements.  Including  initial  conditions,  instrument 
biases  and  random  errors,  there  are  seventeen  parameters  to  be  identified. 

The  identification  is  started  from  wind  tunnel  values.  For  a 20  sec.  long 


IVA  Estimates  for  a Swept  Wing  Navy  Fighter 
(Dimensional  Derivatives) 


Not  Identifiable 


run,  (400  data  points),  the  per  iteration  time  is  4 seconds.  The  program 
converges  in  5 iterations  for  a total  CPU  time  of  20  seconds.  The  parameter 
values  are  shown  in  Table  10  and  the  time  history  plots  in  Figure  8. 

Note  that  this  has  been  achieved  without  any  loss  in  accuracy  of  estimates. 

Table  10 

Identified  Derivatives  for  1-2B  Aircraft 
Speed:  679  ft/ sec 
Altitude:  10,000  ft 


PARAMETER 

z0 

- 2.621 
(0.02S3) 

Z 

ft 

U 

z„ 

1.0 

q 

Xa 

ft 

X 

ft 

a 

X 

ft 

q 

Ma 

-25.29 

(2.28  x 10  *) 

M 

u 

0.0017 

M 

q 

- 4.6S6  p 

(9. 21  x 10  *) 

z«e 

0.1917 

(0.0154] 

X«e 

ft 

Mj  - 

-25.51 

(0.114) 

EQUATION  BIAS 

Z0 

Mo 

MEASUREMENT  BIAS 

bo 

rad 

bu 

ft  sec  * 

b9 

rad  sec  1 

be 

rad 

b4. 

ft  sec'2 

RANDOM  NOISE 

STANDARD  DEVIATIONS 

o 

a 

rad 

°u 

ft  sec'1 

°q 

rad  sec1 

°8 

rad 

ft  sec'1 

-0.00216 

0.S80 


0.00234 

-3.3 

0.00113 

0.00146 


0.00S4 

3.17 

0.0287 

0.0198 

9.35 
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Figure  8 Identification  of  T-2  Derivatives  Using  the  On-Line  Maximum  Likelihood  Program 


RUN  19  OUTPUT  ET.ROR 


Figure  8 (Concluded)  Identification  of  T-2  Derivatives  Using  the  On-Line  Maximum  Likelihood  Program 


COMPUTER  TIME  (MINUTES) 


Figure  9 shows  the  computation  time  required  to  implement  the  on-line  maxi- 
mum likelihood  program  on  a UNIVAC  1108  for  different  numbers  of  identified 
parameters  and  data  points. 


Use  of  more  optimal  computer  instruction  organization  and  implementation  of 
the  simplification  given  in  the  previous  chapter  would  certainly  produce  a viable 
real  time  maximum  likelihood  program.  The  work  on  this  extremely  fast  maximum 
likelihood  method  should  be  based  on  the  formulae  of  the  previous  chapter. 


NUMBER  OF  PARAMETERS  IDENTIFIED 


Figure  9 Computer  Execution  Time  Versus  Number  of  Parameters 
Identified  for  SCIDNT  with  Sensitivity  Function 
Reduction 
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4.6  SUNMARY 


The  results  presented  in  this  chapter  clearly  demonstrate  that  the  instru- 
mental variables  and  on-line  maximum  likelihood  approaches  are  two  viable  tech- 
niques for  on-line  and  real  time  identification  of  parameters.  By  fine  tuning 
the  algorithms  to  the  specific  problem  and  the  computer,  the  computation  time 
and  storage  could  be  reduced  to  meet  the  constraints  of  the  real  time  application. 


V.  ON-LINE  EVALUTION  OF  DATA  QUALITY  FOR  IDENTIFICATION 
5.1  INTRODUCTION 

A considerable  amount  of  flight  test  data  may  have  to  be 
discarded  because  of  its  inability  to  identify  stability  and  control 
derivatives  to  the  desired  accuracy.  This  happens  when  instruments 
deteriorate  or  fail,  the  input  signal  is  unsatisfactory,  or  there  is 
intermittent  loss  of  the  telemetry  link.  Often  the  flight  tests 
have  to  be  repeated  to  obtain  this  data  again  causing  additional  delay 
and  increased  cost.  Some  flight  testing  facilities  around  the  country 
have  aquired  modem  equipment  to  monitor  the  flight  test  in  real  time. 

An  example  of  such  equipment  is  the  Real  Time  Processing  System  (RTPS) 
installed  by  the  Navy  at  the  Naval  Air  Test  Center,  (NATC) , Patuxent 
River,  Maryland. 

Real  time  flight  monitoring  systems  like  the  RTPS  have  vastly 
increased  the  capabilities  for  quick  evaluation  of  flight  tests. 

This  could  have  a tremendous  payoff  because  if  a certain  maneuver  is 
unlikely  to  give  reliable  results,  that  maneuver  could  be  repeated 
immediately.  This  would  obviate  the  need  for  waiting  for  future 
flight  tests  to  repeat  this  maneuver.  Secondly,  most  of  the  "bad"  data 
will  be  isolated  before  it  is  passed  through  the  more  expensive  model 
structure  determination  and  parameter  identification  algorithms.  Thus, 
to  make  full  use  of  this  capability,  there  is  need  for  on-line  evaluation 
of  data  quality. 

On-line  and  real  time  data  quality  analysis  is  a broad  subject  and 
may  be  called  upon  to  answer  many  different  questions.  In  this  chapter, 
attention  will  be  focused  on  how  to  determine  if  certain  data  is  useful 
for  the  purpose  of  system  identification.  The  next  section  will  discuss 
the  various  requirements  for  a viable  real  time  data  quality  analysis 
algorithm.  The  techniques  for  determining  instrument  deteriorations 
and  failures  is  given  in  Section  5.3.  Section  5.4  deals  with  the 

75 


determination  of  the  effect  of  deteriorated  or  failed  instruments  on 
parameter  identifiability.  The  case  of  poor  system  excitation  is 
discussed  in  Section  5.5.  The  chapter  concludes  with  a discussion  of 
useful  criterion  for  evaluating  data  quality  in  Section  5.6. 


5.2  REQUIREMENTS  FOR  ON-LINE  EVALUATION  OF  DATA  QUALITY 

There  are  three  considerations  which  define  the  requirements  for  an 
on-line  and  real  time  data  quality  analyzer.  These  are:  (a)  a priori 
information,  (b)  desired  accuracy  on  data  quality  evaluation,  and 
(c)  implementation  environment. 

The  complexity  of  the  on-line  data  quality  analysis  algorithm 
depends  strongly  on  what  can  be  assumed  known  about  the  system  before 
the  flight  test  is  conducted.  One  of  the  most  important  among  them 
is  the  knowledge  about  the  system  structure  and  approximate  parameter 
values.  If  this  is  unknown,  it  is  essential  to  obtain  approximate 
parameter  estimates  using  techniques  of  Chapters  II  and  III.  Of  course, 
this  would  result  in  a considerable  increase  in  the  computation  burden. 

The  second  factor  is  the  knowledge  of  errors  in  the  instruments  and 
their  failure  and  deterioration  modes.  A priori  information  about  the 
control  surface  actuators,  telemetry  link,  control  and  aircraft  non- 
linearities  are  also  useful. 

The  second  consideration  in  the  real  time  and  on-line  data  quality 
evaluation  is  the  accuracy  with  which  it  must  be  determined.  In  most 
practical  applications,  it  is  not  necessary  to  compute  the  entire 
information  matrix.  Usually,  since  a yes/no  answer  is  all  that  is  desired, 
a measure  of  the  dispersion  matrix  (inverse  of  the  information  matrix 
and  Cramer-Rao  lower  bound  on  parameter  estimation  errors)  is  sufficient. 
Two  suitable  measures  are  the  weighted  trace  and  the  determinant  of 
the  dispersion  matrix,  or  an  upper  bound  on  either  of  these  two  measures. 
The  upper  bounds  would  ensure  that  at  least  a certain  accuracy  can  be 
achieved  in  parameter  estimates. 
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The  implementation  environment  determines  if  the  data  quality 
evaluation  can  be  performed  in  real  time  or  if  it  is  necessary  to  wait 
after  a maneuver  is  completed.  The  structuring  of  the  algorithm  also 
depends  on  the  constraints  of  a particular  situation.  However,  the 
real  time  and  on-line  application  clearly  requires  that  the  guidelines 
for  computer  implementation  of  on-line  identification  methods  detailed 
in  Chapter  II  be  followed  here. 

The  data  quality  will  be  analyzed  in  the  light  of  three  main  causes  of 
unsatisfactory  identification  results  from  a given  data:  (a)  instrument 
deterioration  and  failures,  (b)  insufficient  system  excitation  or  poor 
inputs,  and  (c)  loss  of  communication  link,  i.e.,  missed  sample  points.  The 
techniques  for  detecting  instrument  degradations  and  failures  are  discussed 
first.  The  next  sections  deal  with  algorithms  for  isolating  poor  inputs. 
These  functions  can  be  performed  simultaneously  or  sequentially. 


5.3  DETERMINATION  OF  INSTRUMENT  FAILURES  AND  DEGRADATION 

The  tests  which  can  be  used  to  determine  instrument  accuracy  and 
check  for  their  failures  can  be  divided  into  two  classes:  (a)  those 

which  explicitly  or  implicitly  use  the  system  model,  and  (b)  those 
which  are  solely  dependent  on  single  instrument  temporal  or  inter- 
instrument consistency  checks.  We  discuss  two  tests  from  the  second 
class  here. 


5.3.1  Temporal  Correlations 


Let  the  system  equations  be 

x(n+l)  = <t>x(n)  + Gu(n)  + Twfn) 


(5.1) 


m 1 there  are  noisy  measurements  of  all  state  variables. 


y(n)  = x(n)  + v(n) 


(5.2) 
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A recursive  equation  can  be  written  for  the  measurements 


y(n+l)  = 4>y(n)  + Gu(n)  + v(n+l)  - <j>v(n)  + rw(n) 


Let  4>n  and  Gn  be  the  best  estimate  of  <J>  and  G 


y(n+l)  - 4>ny(n)  - Gnu(n)  = (<|>-<f>n)  y(n)  + (G-Gn)  u(n) 


+ v(n+l)  - c|>v(n)  + fw(n) 


Define 


z(n)  A y(n+l)  - <fcny(n)  - Gnu(n) 


For  high  sampling  rate,  C4>~<J>n)  and  (G-G  ) are  small  compared  to  the  noise 
terms  even  when  the  parameters  are  far  from  true  values.  The  autocorrelation 
of  z(i)  becomes 


Z(0)  = E(z(i)  zT(i)}  = R + <t»R4>x  + TQr 


T . ™nT 


Z(l)  = E{z(i+1)  z1  (i)}  = -<j>R 


If  the  sampling  rate  is  very  high, 
to  give 


4>  can  be  approximated  by  identity 


Z(0)  = 2R  + TQr 


Z(l)  = -R 


z(n)  = y(n+l)  - y(n)  - Gnu(n) 


Estimates  of  Z(0)  and  Z(l)  are  obtained  from  the  measurements 


:(0)  = Z z(i)  zT(i) 


:(D  - 2 z(i+i)  2 1 (i) 

i=l 


The  reason  we  can  replace  ensemble  average  by  time  average  is  that  for 
high  sampling  rate,  z is  a stationary  process.  Z(l)  can  now  be 
used  as  a simple  check  for  the  instrument  accuracy  and  correct  functioning. 
If 


H0: 

Hl:  ' 

the  hypothesis  is  tested  against  the  alternate  hypothesis 

A low  Z..  could  result  from  dead  instrument,  base  connection,  incorrect 
JJ 

gain,  etc.  A high  would  signify  instrument  failure  and  degradation 

and  noisy  channel. 


Notice  that  this  technique,  though  simple,  can  detect  only  few  of 
the  many  faults  that  can  occur.  Its  effectiveness  in  isolating  the 
failure  would  often  be  limited. 

5.3.2  Inter- Instrument  Comparisons 


In  the  absence  of  good  system  models,  this  method  can  be  applied  only 
if  two  or  more  instrument  outputs  are  related  by  kinematic  equations. 

An  example  is  the  measurements  of  pitch  angle,  pitch  rate  and  pitch 
acceleration.  The  technique  is  illustrated  by  the  following  example. 
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The  equations  governing  9,  q and  q are  approximately 


9(n+l)  = 0(n)  + Aq(n) 


q(n+l)  = q(n)  + Aq(n) 


Since  there  are  discrete  noisy  measurements  y^,  y2  and  y3  of  0,  q,  and  q, 
we  get 


/0(n+l)\  /I  AW0(n)\  /0\ 

\q(n+l)/  \ 0 1/  \q(n)  / \ A / 


y3(n)  - / 0\  v3 


/ylW\  , ('  °\  /8(n)\  t/Tl\ 

'0  1/  \q(n)/  \ v7  / 


\y2(n)/  \0  1/  \q(n)/  \ v2  / 

y3  is,  in  this  formulation,  the  input  signal  for  the  0,  q system.  Since 
all  state  definition  matrices  are  known  a Kalman  filter  (adaptive  or 
nonadaptive)  can  be  propagated  in  real  time.  The  innovations  covariance 
is  related  to  the  covariance  of  the  random  noises  Vp  v2  and  v,.  The 
instrument  failures  can  be  detected  quickly.  This  method  has  been  used 
for  data  smoothing  by  Molusis  [231. 

Another  simple  and  effective  procedure  can  be  used  when  the  kinematic 
relations  are  as  simple  as  this  case. 


yjfn+l)  - yjCn)  = Ay2(n)  + ’^(n+l)  - v^n)  - Av2(n)  (i 

y2(n+l)  - y2(n)  = Ay3(n)  + v2(n+l)  - v2(n)  - Av3(n)  (S 

Instrumental  variables  technique  is  used  to  estimate  the  coefficients  of 
y2(n)  in  Equation  (5.18)  and  y3(n)  in  Equation  (5.19).  Let  the  estimated 
values  be  A^  and  A.,,  respectively.  They  are  compared  with  the  true  values 


of  A.  If  this  is  done,  it  is  simple  to  detect  and  isolate  instrument  failures, 
if  only  one  instrument  fails.  The  following  "truth"  table  can  be  used. 


Aj  = A 

Ax  t A 

A2  - A 

All  Instruments  Working 

Pitch  Angle  Incorrect 

A2  t A 

Pitch  Acceleration  Incorrect 

Pitch  Rate  Gyro  Failure 

The  instrumental  variables  method  is  simple  to  use  because  the  model  is 
not  exactly  as  long  as  there  are  no  failures.  Note  that  the  instrument 
noises  can  also  be  determined.  This  procedure  can  be  modified  for  more 
complicated  kinematic  models. 

An  on-line  adaptive  Kalman  filter  can  also  be  used  to  determine 
various  noise  covariances.  This  technique  would  be  useful  in  determining 
instrument  degradations  rather  than  catastrophic  failures. 

5.4  EFFECT  OF  INSTRUMENT  FAILURES  AND  DEGRADATIONS  ON  PARAMETER 
IDENTIFIABILITY 


The  evaluation  of  the  reduction  in  identifiability  of  parameters 
resulting  from  the  failure  or  degradation  of  one  or  more  instruments, 
is  facilitated  by  isolating  the  contribution  of  each  instrument  towards 
the  estimation  accuracy  of  the  parameter  values.  Therefore,  the 
information  matrix,  a measure  of  the  information  about  parameter  values, 
is  decomposed  for  various  instruments. 

Consider  a continuous  time  representation  of  a linear  time  varying 
system 


with  measurements 


where  x is  a n x 1 state  vector,  y is  a p x 1 output  vector  and 
u is  a q x 1 input  vector.  F,  G and  H are  matrices  of  appropriate 
dimensions  and  depend  upon  m unknown  parameters  0.  The  information 
matrix  for  parameters  0,  for  a test  of  length  T,  is 


where  R is  the  power  spectral  density  of  the  measurement  noise  v 
Equation  (5.22)  can  be  written  as 


The  single  subscripts  denote  a row  or  column  of  a matrix  and  the  double 
subscripts  an  element  of  the  matrix 


The  information  matrix  decomposition  of  Equation  (5.25)  is  very  useful. 
The  first  matrix  on  the  right  hand  side  is  the  information  matrix  about 
the  parameters  if  only  the  first  instrument  is  available.  The  second 
matrix  is  the  additional  information  provided  by  the  second  instrument, 


r 


given  that  the  first  instrument  is  already  present.  If  the  measurement 
noise  in  different  channels  is  independent,  this  decomposition  of  the 
information  matrix  simplifies 

M(i,j)  =0  i t j (5.26) 


giving 


M = + l/2)  + Mf3)  + . . . + (5.27) 

Notice  that,  in  this  case,  the  information  supplied  by  each  measurement  is 
independent  of  the  presence  of  other  instruments.  We  consider  this  case 
from  now  on.  The  generalization  to  correlated  measurement  noise  is  obvious. 


The  dispersion  matrix  (inverse  of  the  information  matrix  and 
Cramer-Rao  lower  bound  on  standard  deviations  of  parameter  estimation 
errors)  is 


D = (m«  + + M(3^  . . . M^)"1 

5.4.1  Instrument  Failures 


(5.28) 


If  the  above  formulation  is  used,  the  effect  of  instrument  failures 
on  parameter  estimation  accuracy  can  be  evaluated  conveniently.  The 
increase  in  dispersion  matrix  because  of  failure  of  the  k^1  instrument 
is 

Dp^  - D = (d'1  - M^)'1  - D = D (i  - M^D)'1  M^D  (5.29) 

(kl 

(I  - M -'D)  is  a positive  semidefinite  matrix.  If  it  is  not  positive 
definite,  the  loss  of  k^  instrument  would  make  one  or  more  parameters  un- 
identifiable. (The  number  of  parameters  which  are  unidentifiable  equals 

fkl 

the  number  of  zero  eigenvalues  of  (I  - NT  JD)).  Assume  that  no  parameter 
becomes  unidentifiable  from  the  failure  of  the  k^  instrument. 
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Then  Dp  - D is  positive  semidefinite  with  rank  equal  to  that  of  NT  ] . 
In  other  words,  the  failure  of  the  instrument  will  reduce  the  identi- 
fiability  of  all  parameters.  Equation  (5.29)  gives  a quantitative  measure 
of  the  increase  in  covariance  of  parameter  estimation  errors. 

5.4.2  Instrument  Deterioration 


The  information  matrix  decomposition  is  useful  in  determining  the 
increase  in  estimation  errors  when  the  errors  in  instruments  increase. 
Suppose  the  power  spectral  density,  R^,  in  the  kth  instrument  increases. 
Differentiating  Equation  (5.28)  with  respect  to  and  using  Equations 
(5.23)  and  (5.27) 


3D 


3R. 


kk 


D gjp-  D 

8Rkk 


J-  D M(k^D 
Rkk 


This  can  he  written  as 


AD 


D M^D 


(5.30) 


(5.31) 


Thus,  the  increase  in  dispers 
R, , is  directly  proportional 


ion  matrix  for  a certain  relative  decrease  in 
to  . 
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5.5  DETERMINATION  OF  INSUFFICIENT  EXCITATION  AND  POOR  INPUTS 

The  determination  of  insufficient  excitation  from  poor  inputs  can  be 
done  through  evaluation  of  the  information  matrix  or  a measure  of  the 
information  matrix.  The  development  of  the  sensitivity  covariance 
accumulation  technique,  detailed  in  Chapter  III,  makes  it  possible  to 
compute  the  information  matrix  on  a real  time  basis.  This  will  be 
computed  for  the  operating  instruments  at  the  a priori  parameter  value 
or  an  approximate  estimate  of  the  parameters. 

There  are  several  problems  with  the  direct  computation  of  the 
information  matrix,  of  Chapter  III,  for  the  present  application.  In 
the  output  error  mode,  the  information  matrix  depends  only  on  the  input 
and  the  instrument  errors,  but  not  on  actually  observed  measurements. 

This  makes  it  very  important  that  the  instrument  errors  be  known  and 
also  whether  the  instruments  are  functioning  normally.  Therefore,  the 
techniques  of  the  last  two  sections  have  to  be  used  to  determine 
instrument  accuracies.  The  second  and  sometimes  a more  important 
problem  is  incorrect  parameter  values.  The  computed  information  matrix 
depends  on  the  a priori  parameter  values  and  could  be  in  error  because 
of  incorrect  parameters.  This  happends  when  the  predicted  state 
variables  are  significantly  different  from  the  true  states.  There 
are  two  methods  to  handle  this  problem. 

In  the  first  technique,  the  x(k)  on  the  right  hand  side  of 
Equation  (3.611  is  replaced  by  the  measured  value  of  the  state  variables 
at  that  instant  of  time,  i.e. , the  sensitivity  equations  are  written  as 

ST-  - ||-  y(k)  . ; * |§-  uoo  (S.32) 

J J 1 J 
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The  error  in  state  sensitivity  follows  the  equation 


W7  *(n+1>  = 


3<t> 

aeT 


vtn)  * 


(5.33) 


The  last  term  is  small  when  the  sampling  rate  is  high  and  the  parameters 
are  not  too  far  from  the  true  values.  With  this  assumption,  we  can  get 
a difference  equation  for  the  covariance  of  the  error 


A A 

Pj(n+1)  = <)>  Pj  (n)<j> 


+ 


dtfr  p ^ 
K 96j 


(5.34) 


The  number  of  sensitivity  equations  can  be  reduced  as  before  considering 
y(n)  and  u(n)  as  inputs  to  the  system.  This  technique  can  be  generalized 
to  the  case  where  measurements  of  all  state  variables  are  not  available. 


The  second  approach  is  to  estimate  the  state  vector  using  a Kalman 
filter  and  the  best  available  parameter  values.  The  process  noise  covariance 
may  have  to  be  increased  artificially  to  account  for  the  modeling  error. 

The  estimated  state  vector,  in  thus  case,  will  have  both  random  and 
systematic  errors.  In  general,  this  will  be  much  closer  to  the  true  state 
than  a purely  propagated  value.  By  incorporating  the  measurements,  the 
system  outputs  are  directly  used  in  the  determination  of  the  information 
matrix.  The  equations  are 


5^.  ♦*£!♦§§- sen,,  if:  uoo 
3 3 3 3 

x(n+l)  = <J>x(n)  + Gu(n)  + <t>k(y(n)  - Hx(n)) 

An  advantage  of  this  method  is  that  it  is  easy  to  use  when  measurements  of 
all  state  variables  are  not  available. 


(5.35) 

(5.36) 
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Consider  the  continuous  system  representation 


x = - ax  + u 


with  continuous  measurements 


(5.37) 


y = x + v (5.38) 

The  "true  value"  of  a is  1 and  the  a priori  value  is  1.5.  Consider  the 
information  about  parameters  resulting  from  an  input 


u(t)  = 6 (t) 


0 < t < 2 


The  sensitivity  equation  for  parameter  a is 


(5.39) 


d 3x  3x 

clt  9a  x ' a 3a 


(5.40) 


True  Information  Matrix 


x (t)  = e" 


- - te'1 
3a 


■ i / m 

o 

1 r t2e-2t 

" r 2 


2 -2t  -2t  -2t 

e te  e 


_ 1 fl  9 -4*1  _ 

‘ r [T  * T e J ‘ 


(5.41) 


(5.42) 


(5.43) 
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Information  Matrix  Computed  Using  A Priori  Value  of  the  Variable 


It  is  straightforward  to  show  that 


M1  = ? J7  - 1 If  e'6  " ^ 


Information  Matrix  Using  the  Measured  State  Variable 


Since 


x(t)  = e 


y(t)  = e t + v 


The  equation  governing  the  sensitivity  of  the  state  to  parameter  a 


d 9x 
3t  9a 


, r 3X  -t 
- !.5  ^ - e - v 


The  expected  value  of  (9x/9a)  is 


9x  - of  -1.5t  -to 
o-  = 2(e  - e ) 


and  its  variance  is 


Xa  = J Cl  - e-3t) 


The  expected  value  of  the  information  matrix  is 


Mz  -555 
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The  second  term  in  Equation  (5.50)  is  a bias  due  to  the  noise  term  driving 
the  sensitivity  Equation  (5.47).  For  a reasonable  signal-to-noise  ratio, 
r is  much  smaller  than  one.  If  this  is  not  so,  the  bias  may  produce  a 
serious  error  in  the  computed  information  matrix.  Notice  also  that  the 
bias  is  independent  of  the  input  and  the  response  and  therefore  can  be 
corrected  for,  as  long  as  the  measurement  noise  covariance  is  known. 

It  is  clear  that  the  second  approach  gives  a better  approximation  to 
the  information  matrix. 

Remarks : 

1.  Many  methods  (e.g.,  Denery  [24]  fall  under  the  general 
class  of  methods  presented  in  the  previous  sections.  Any 
of  these  could  also  be  used  for  the  on-line  evaluation  of 
data  quality. 

2.  Sensitivity  function  method  gives  a more  accurate 
information  matrix,  at  the  cost  of  increased  computation 
time.  This  method  is  also  more  general  and  can  be  used 
with  slight  loss  of  efficiency  when  all  measurements  are 
not  available. 

3.  The  estimate  of  the  state  based  on  previous  measurements 
need  not  be  optimal.  However,  there  may  be  a serious 
error  in  the  information  matrix  if  there  are  large 
errors  in  state  estimates. 

4.  The  instrument  checks  and  information  matrix  evaluation 
can  be  carried  out  simultaneously  on-line.  The  information 
matrix  can  be  determined  as  if  all  instruments  are 
functioning  normally.  The  kinematic  Kalman  filters  can 

be  propagated  in  parallel  to  determine  if  the  instruments 
are  functioning. 
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5.6  SCALAR  CRITERIA 


In  the  last  three  sections , we  developed  algorithms  for  quick 
determination  of  the  dispersion  matrix  for  on-line  and  real  time  use. 

To  make  more  effective  use  of  real  time  data  quality  analysis,  there  is 
a need  to  translate  the  dispersion  matrix  into  a scalar  criterion,  upon 
which  the  engineer  could  base  the  success  or  failure  of  a flight  test. 

The  trace,  the  weighted  trace  and  the  determinant  of  the  dispersion 
matrix  are  possible  measures.  The  following  recursive  relations  can  be 
used  to  update  the  trace  and  the  determinant  of  the  dispersion  matrix. 

ir(DN)  = • Tr  [°n-i  He  1 + He(V  °n-i  HeT(:V} 

He  DN-l^  C5-51 

^1  = K-i|  x I1  + Mn-i'1  r'X  VV|  C5.52; 


Updating  the  trace  or  the  determinant  of  the  dispersion  matrix  requires 
storing  the  entire  dispersion  matrix  and  updating  it.  A lower  bound  can 
be  obtained  for  the  determinant  of  the  dispersion  matrix  which  does  not 
require  that  the  entire  matrix  be  stored 


m 


(5.53; 


W ■ £ HeT(Vi>  *‘VW 


The  inequality  of  Equation  (5.53)  is  useful  when  N+k  is  not  too  small 


MICROCOPY  RESOLUTION  TEST  CHART 
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5.7  SUNMARY 


This  chapter  presented  some  results  on  the  techniques  for  on-line  evalua- 
tion of  data  quality.  The  detection  of  instrument  linear  failures  and  degrada- 
tions are  discussed  and  their  effects  on  reducing  parameter  identifiability  are 
considered.  The  problem  of  insufficient  system  excitation  is  also  treated  in 
detail.  The  use  of  these  techniques  would  lead  to  a considerable  saving  in 
flight  test  time. 
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VI . CONCLUSIONS 


A review  of  parameter  identification  methods  reveals  that  the  accuracy  and 


convergence  properties  of  off-line  algorithms  is  not  generally  characteristic 
of  on-line  and  real  time  programs.  This  work  has  produced  techniques  which 
attempt  to  bridge  the  gap  between  such  requirements  by  developing  a very 
accurate  on-line  algorithm  and  a fast  off-line  algorithm.  These  new  techniques 
are  basically  as  follows: 

(a)  A State  Variable  Instrumental  Method  which  gives  statistically  effi- 
cient and  computationally  convergent  algorithms. 

(b)  A Linear  Maximum  Likelihood  Algorithm  which  incorporates  a new  sensi- 
tivity functions  reduction  method,  producing  a near-real  time  program 
with  no  loss  of  accuracy. 


Both  of  these  techniques  have  been  evaluated  on  simulated  and  flight  test 
data.  Results  for  the  instrumental  variable  approach  (IVA)  showed  notable 
improvement  over  previous  implementation.  No  loss  in  accuracy  occurs  because 
the  sensitivity  functions  technique  upon  which  the  time  reduction  is  based,  is 
an  exact  algebraic  reduction  of  the  calculation  requirements. 


Further  reductions  in  time  are  possible  by  use  of  several  methods,  includ- 
ing the  following: 

\ ' 

(1)  Use  of  preprocessing  data  consistency  and  evaluation  algorithms  to 

optimally  filter  spurious  errors. 

E 1 

m \ 

(2)  Use  of  particular  approximations  to  various  stages  of  the  maximum 


likelihood  algorithm. 


The  analytical,  conputational , and  evaluation  results  of  this  work  lead 
to  the  following  principal  conclusions: 


(1)  Significant  savings  in  computational  time  can  be  achieved  by  using 
on-line  or  specialized  off-line  programs  to  estimate  stability  and 
control  derivatives.  Such  a capability  can  be  applied  to  effect  reduc- 
tion in  total  flight  test  time. 

(2)  The  demonstrated  capability  of  on-line  algorithms  can  be  extrapolated 
to  other  real  time  applications,  including  advanced  control  system 
and  fault  detection  objectives.  Such  extrapolation  is  consistent 
with  the  computer  hardware  advances  now  being  implemented  on  advanced 
Naval  systems. 


APPENDIX  A 

REDUCTION  IN  SENSITIVITY  FUNCTION  COMPUTATION  FOR 
LINEAR  TIME  INVARIANT  SYSTEMS* 

A. 1 INTRODUCTION 

The  problem  of  computing  state  sensitivities  using  reduced 
order  models  has  become  very  important  in  parameter  estimation 
involving  high  order  models  and  many  unknown  parameters.  These 
techniques  allow  a considerable  saving  in  computation  time  which 
makes  the  determination  of  optimal  inputs  and  estimation  of 
parameters  from  real  data  feasible  for  practical  systems.  Most 
efforts  to  date  [26-28]  have  concentrated  on  finding  bounds  on  the 
order  of  the  model  which  can  generate  state  sensitivities  for  all 
system  parameters.  Very  little  attention  has  been  paid  to  the 
formulation  of  practical  techniques  leading  to  these  lowest  order 
models.  Formulations  by  Wilkie  and  Perkins  [26],  Denery  [27], 
and  Neuman  and  Sood  [28]  lead  to  fairly  complicated  transformations 
and  are  not  capable  of  exploiting  the  characteristics  of  the  system 
in  most  cases. 


This  appendix  develops  a practical  method  for  obtaining  lowest 
order  models  for  sensitivity  functions  computations.  The 
technique  makes  full  use  of  special  system  characteristics  and 
has  general  application  to  high  order  systems  with  a large  number 
of  unknown  parameters.  The  problem  of  computing  sensitivities  for 
initial  conditions  is  generalized  to  computing  sensitivities  for 
an  additional  input  distribution  vector. 


* This  appendix  is  taken  from  Reference  25,  a paper  published 
under  this  contract. 
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This  appendix  is  organized  as  follows.  Section  A.  2 gives  a 
statement  of  the  problem,  assumptions  and  notation.  The  problem 
of  single  input  is  treated  in  Section  A. 3 and  is  generalized  for 
multi-input  systems  in  Section  A. 4.  Section  A. 5 presents  the 
algorithm  to  implement  this  technique.  The  conclusions 
and  some  further  work  is  indicated  in  Section  A. 6. 

A. 2 PROBLEM  STATEMENT 

Consider  a system 

x = Fx  + Gu  x(0)  = xq  (1) 

where  x is  an  n x 1 state  vector,  u is  a q x 1 control  vector. 

F and  G are  matrices  of  appropriate  dimensions  and  are  functions 
of  m parameters  9.  The  system  starts  from  the  initial  state  xq. 

An  alternate  representation  of  the  system  is  obtained  by  adding 
one  more  input  and  converting  the  initial  condition  to  zero,  i.e., 

x = Fx  + Gu  + xQuq+1 

A Fx  + G ' u ' x (0)  = 0 (2) 

and 

Vi  ■ s(t) 

where  6 is  the  dirac  delta  function.  This  shows  that  the  sensi- 
tivities for  the  initial  conditions  can  be  computed  in  much  the 
same  way  as  the  sensitivities  for  other  parameters  in  G.  The 
initial  condition  vector  must  be  adjoined  to  the  control  distri- 
bution matrix  even  while  computing  sensitivities  for  9 with 
known  but  non-zero  initial  condition.  Since  xQ  is  not  different 
from  parameters  in  G in  this  representation,  the  primes  will  be 
removed  in  Equation  (2).  Thus,  Equation  (1)  with  zero  initial 
condition  can  be  considered  without  loss  of  generality.  The 
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problem  is  to  compute  the  state  sensitivites  for  parameters  0 
for  all  time  in  an  efficient  manner. 


A heretofore  uncited  property  of  systems,  which  depends  on 
the  parameters  0,  is  important  in  sensitivity  computation. 

Definition  1 - -Structural  Controllability:  A system  is  said 

to  be  structurally  controllable  if  it  is  controllable  for  almost 
all  values  of  parameters.  The  system  may  be  uncontrollable  if 
certain  relations  hold  among  the  parameters. 

Definition  2 - -Structural  Linear  Dependence:  A set  of  vectors 

have  structural  linear  dependence  if  a linear  combination  of 
these  vectors  is  zero  for  almost  all  values  of  parameters.  The 
particular  linear  combination  may  depend  on  the  values  of  the 
parameters . 


Example  1:  Consider  the  system 


9i  92\  / 1\ 


3 V 


x + 


M / 


u 


(3) 


the  controllability  matrix  is 


,i  e,  ♦ e2v 

' 1 8,  * 8.  ' 


The  system  is  controllable  unless  0^  + 02  " 03  + 04’  Thus,  if 


0. 


0, 


-1  and  0. 


0. 


-5,  the  system  is  uncontrollable  in 


the  classical  sense  but  structurally  controllable. 
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Initially,  the  following  simplifications  can  be  made: 


a.  The  system  is  made  structurally  controllable  (including 
xQ  in  G)  by  dropping  uncontrollable  states.  Since  the 
initial  condition  is  zero,  the  system  never  moves  into 
the  uncontrollable  subspace.  This  reduces  the  order 

of  the  system.  Note  that  the  states  which  are 
uncontrollable  only  for  the  given  values  of  the 
parameters  but  which  are  structurally  controllable 
should  not  be  dropped. 

b.  All  structurally  linearly  dependent  columns  of  G matrix 
are  lumped  with  other  columns.  This  reduces  the  number 
of  effective  controls. 

These  two  simplifications  reduce  computations  later.  However, 
the  order  of  the  system  required  to  general  sensitivity  functions 
will  be  the  same,  even  if  these  simplifications  are  not  made. 

The  state  sensitivity  for  parameter  0^  follows  the  differ- 
ential equation 


d 9x  _ p 3x_  + UL  v 
Jt  F0T  39^  36j 


3G 

3ej 


u 


Hr  <°>  ’ 0 


(4) 


The  state  sensitivities  for  all  parameters  0 can  be  written  as 

X0  = F0X0  + G0U 
x0(O)  = 0 


102 


;i 


where 


x 

3x 

30, 


n(m  + 1)  x 1 


3x 

36 


L.  m J 


_ 

» — 

F 

3F  F 0 

361  0 F 

G 

3G 

F-  * 

. . 0 . 

G„  = 

• 

e 

• • 

30“  0 0 F 

m 

0 

• 

3G 

30 

. m J 

i 

n(tn  + 1)  x n(m  + 1) 

n(m  4 1) 

(6) 


If  (Fg , G0)  is  uncontrollable,  the  corresponding  controllability 


matrix  is  of  rank  less  than  (m+l)n,  say  r.  Let  be  the  set 
of  r independent  columns  in  the  controllability  matrix.  If  Q2 
is  such  that  Qj  and  Q2  form  a set  of  n(m+l)  linearly  independent 
vectors,  then 


-1 


xe  £ (Q! : q2>  *oA 


(7) 


\ 


follows  the  differential  equation  (see  Chen  [15]) 


• f 


(8) 
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Since  the  initial  condition  in  (8)  is  zero,  the  last  (m+l)n-r 
uncontrollable  states  remain  zero  throughout.  The  remaining  r 
states,  xc  follow  the  differential  equation 


x = F x_  + G u 
c c c c 


where 


Also 


xc(0)  = 0 


Fc  A f„  - Ql  FeQ! 

[C»V  G8 


Xe  (Qi|Q2)  xe 

' Vc 

1 + 

since  other  states  in  x0  are  zero.  Note  that  is  a pseudo- 
inverse of  depending  on  Q2-  The  transformation  from  F0 , 

G0 , to  Fc,  Gc  and  from  xQ  to  xQ  does  not  involve  Q2  explicitly. 
Therefore,  can  be  chosen  to  be  any  pseudo- inverse  of  , 
for  example, 

qJ  = CQl  Ql)'1  0 


It  is  assumed  here  that  the  inputs  are  linearly  dependent. 
If  this  is  not  so,  the  number  of  inputs  can  be  reduced  until 
they  are  linearly  independent.  This  will  usually  result  in  a 
reduction  in  the  controllable  subspace  of  (F0 , G0)  as  shown  in 
Section  A. 4.  Under  the  assumption  of  linear  independence  of 
inputs,  it  is  necessary  and  sufficient  to  solve  a system  of  r 


104 


linear  equations  (9)  to  determine  the  state  and  its  sensitivities 
at  all  times.  The  next  sections  investigate  the  nature  and 
dimension  of  the  controllable  subspace  of  (F0  , G0)  and  explore 
efficient  methods  for  finding  Qj . 

A.  3 SINGLE  INPUT  SYSTEMS 


In  single  input  systems,  F0  is  a (mxl)n  x (m+l)n  matrix  and 
G0  is  a (m+l)n  vector.  The  following  theorem  holds. 

Theorem  1:  For  a single  input  system,  the  rank  of  the  control- 

lability matrix  of  (F0  , G0)  cannot  exceed  2n. 

Proof:  The  controllability  matrix  of  (F0 , G0)  is 

ce  ” Ge: FeGe:  • • •Fe”'+1'ln"1  Ge  t13> 

It  is  easy  to  show  that 
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pBWE  - 1 1.-  w-ii  yVM1 


the  (n+k)th  column  of  CQ  is 


D(Fn+k'1G)  = E {D*Ca.I)Fk+1‘1G  + a.  D(Fk+i_1G)}  (16) 

i=0  1 1 


The  second  term  is  a linear  combination  of  n preceding  columns 
of  CQ.  Thus, 


rank  C0  = rank{D(G) : : D(Fn_iG):  E D*(a.I)F1G 


_ , n-1 

v n hr*.  T>rlr* 


• i=0 

: V D*(a1I)F1+n+^'1G;  V D*  (a  ■ I)  Fn+1G" } 

' i-°  ' ^ 1 • (17) 


The  (2n+k)th  column  of  the  right  hand  side  matrix  is 


E D*(ot,I)Fn+1  + k'1G  = E D*(a.I)  "e*  a.F1  + ;i+k‘1G 
i = 0 1 i = 0 1 j =0  J 


n-1  n-1 


E a.  E D*(a.I)F1+(j+k)'1G  (18) 

j =0  J i = 0 1 


which  is  a linear  combination  of  n previous  columns  for  k > 0. 
Therefore , 


rank  C = rank [G0 j F0G0 : Ffi2n"1Gfi]  < 2n 


9 J - 


Thus,  the  order  of  the  system  required  to  compute  all  state 
sensitivities  for  a single  input  system  cannot  exceed  2n.  In 
many  practical  cases,  it  is  smaller  as  shown  in  Example  1. 
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If  the  structurally  controllable  subspace  of 
(F,  G)  is  of  the  order  p,  the  maximal  order  of  the  controllable 
subspace  of  (Ffl , Gfl)  is  2p. 


Corollary  1 


Since  F^G  is  a linear  combination  of  G 
corollary  follows  immediately. 


Example  2:  Consider  the  following  system 


The  state  vector  and  its  sensitivities  for  0^  and  0^  form  a set 
of  six  differential  equations.  Since  the  number  of  states  is  two 
and  the  number  of  controls  is  one,  only  the  first  four  columns 
can  be  independent  in  the  controllability  matrix  of  (Fq , Gq) . 
These  columns  are 


rank(C  ) - rank  I D 


rank 


Thus,  the  procedure  for  finding  independent  columns  of  CQ 
consists  of  finding  structurally  independent  columns  of  the 
controllability  matrix  of  (F,  G) , choosing  (q+l)n  appropriate 
columns  from  C0  and  checking  to  see  if  there  is  any  further 
linear  dependence. 

Another  simplification  is  possible  in  large  order  systems 
in  which  each  input  controls  only  a small  number  of  states.  If 
is  the  dimension  of  the  controllable  subspace  of  the  i*"*1 
input,  no  more  than  2k^  columns  involving  can  be  linearly 
independent  in  the  right  hand  side  matrix  of  (29)  , as  shown 
in  corollary  1 and  theorem  1. 


Corollary  2:  If  for  any  single  input  Uj  the  system  is 

completely  controllable, 

p = rank[D(G1,...,Fn_1G1),D(G2,...,Fn"1G2) DCG^ , . . . ,Fn_1G  ) , 


ED*(ot1I)F1Gj ED*(aiI)Fn+1  LG  ] 


(30) 


The  proof  is  obvious  since  G^,  ...,  Fn  *G.  form  a set  of 
n linearly  independent  columns. 


A. 5 COMPUTATION  PROCEDURE 


A computer  program  has  been  written  to  carry  out  this 
sensitivity  functions  reduction  in  linear  constant - coeff ic ient 
systems.  The  following  procedure  is  used. 

a.  The  initial  condition,  if  nonzero,  is  appended  to  the 
input  distribution  matrix  and  the  number  of  inputs 
is  increased  by  one.  The  linearly  dependent  columns 
in  G are  merged.  Then  the  structurally  uncontrollable 
states  in  (F,  G)  are  dropped. 


Matrices  F0  and  G0  are  formed, 


^1’  ^2 ’ • * * * kq  o f 


Equation  (25)  are  determined  and  are  used  to  choose 
(q+l)n  appropriate  columns  from  the  controllability 
matrix  of  (F0 , G0). 

The  dimension  of  state  space  controllable  from 
each  input  u^  alone  is  determined.  If  for  any  i 


2k^  < n + kx  (31) 

the  last  n+k^-2kT  columns  involving  G^  in  the  right 
hand  side  matrix  of  (29)  are  dropped. 

d.  The  remaining  columns  are  checked  for  linear  inde- 
pendence. Gram-Schmidt  procedure  is  used  to  drop 
columns,  which  are  linearly  dependent  on  other 
columns.  The  set  of  remaining  columns  is  . 

e.  Any  pseudo- inverse  of  is  determined.  Equation  (12) 
is  used  to  compute  F and  G . 

f.  Equation  (9)  is  solved  for  x£(t)  and  Equation  (11) 
is  used  to  find  x0  at  the  desired  points. 

A.  6 CONCLUSIONS 

This  appendix  presents  a technique  for  determining  state 
sensitivity  functions  for  parameters  in  state  transition  matrix, 
input  distribution  matrix,  and  initial  conditions  for  linear 
time- invar iant  systems.  The  input  distribution  matrix  is 
augmented  with  the  initial  condition  vector  by  addition  of  a 
new  virtual  input.  The  state  sensitivities  for  initial  condition 
is  computed  in  the  same  way  as  the  state  sensitivities  for  other 
parameters  in  input  distribution  matrix. 
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A systematic  method  for  finding  the  controllable  subspace  of 
the  augmented  system,  in  which  the  state  vector  is  the  system 
state  and  its  sensitivities,  is  presented.  It  is  necessary  to 
start  with  no  more  than  (q+l)n  columns  of  the  controllability 
matrix  of  the  augmented  system.  These  columns  can  be  selected 
quickly  by  inspection  of  the  controllability  matrix  of  the  initial 
system.  If  r is  the  dimension  of  the  controllable  subspace  of 
augmented  system,  it  is  necessary  to  solve  r linear  equations 
to  evaluate  the  state  vector  and  its  sensitivities. 

This  method  of  sensitivity  functions  reduction  fully  exploits 
the  characteristics  of  the  system  and  that  the  sensitivity  to  all 
parameters  in  the  system  may  not  be  required.  It  leads  to  the 
minimal  order  model  under  the  circumstances. 
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